Node.js SDK for Joyent Manta

This is the reference documentation for the Manta Node.js SDK. Manta is Joyent's Object Storage Service, which enables you to store data in the cloud and process that data using a built-in compute facility.

This document explains the Node.js API interface and describes the various operations, structures and error codes.

Conventions

Any content formatted like this:

$ curl -is https://us-east.manta.joyent.com

is a command-line example that you can run from a shell. All other examples and information are formatted like this:

client.ls('/jill/stor/foo', function (err, res) {
    assert.ifError(err);
    ...
});

Installation

First, install the SDK via npm; the package name in npm is manta. You may want to install the package globally with npm's -g flag, as this should place the node-manta CLI in your $PATH.

npm install manta

Once you've installed the npm package, there are a few environment variables that are useful to set if you plan to work with the CLI; they are not strictly necessary, but they save you from passing command-line options on each invocation. The variables are your SmartDataCenter login name and SSH public key fingerprint (manta uses the same credentials), and the URL of the manta endpoint you wish to interact with. The commands below assume that your SSH public key is the default id_rsa.pub key, located in your $HOME/.ssh directory (on Mac OS X and UNIX environments). The first command simply parses the SSH fingerprint and sets it in the requisite environment variable.

$ export MANTA_KEY_ID=`ssh-keygen -l -f ~/.ssh/id_rsa.pub | awk '{print $2}' | tr -d '\n'`
$ export MANTA_URL=https://us-east.manta.joyent.com/
$ export MANTA_USER=jill
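
Before relying on these variables from a script, it can help to confirm that they are actually set. The following is a minimal sketch; missingMantaEnv is a hypothetical helper (not part of node-manta) and it only checks that each variable is a non-empty string.

```javascript
// Hypothetical helper (not part of node-manta): returns the names of the
// environment variables described above that are unset or empty.
function missingMantaEnv(env) {
    var names = ['MANTA_KEY_ID', 'MANTA_URL', 'MANTA_USER'];
    return names.filter(function (name) {
        return typeof (env[name]) !== 'string' || env[name].length === 0;
    });
}

var missing = missingMantaEnv(process.env);
if (missing.length > 0) {
    console.error('unset environment variables: %s', missing.join(', '));
}
```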

Creating a Client

In order to create a client, use the createClient API available on the top-level of the SDK. The example below assumes that you are using the environment variables you set above.

var assert = require('assert');
var fs = require('fs');
var manta = require('manta');

var client = manta.createClient({
    sign: manta.privateKeySigner({
        key: fs.readFileSync(process.env.HOME + '/.ssh/id_rsa', 'utf8'),
        keyId: process.env.MANTA_KEY_ID,
        user: process.env.MANTA_USER
    }),
    user: process.env.MANTA_USER,
    url: process.env.MANTA_URL
});
assert.ok(client);

console.log('client setup: %s', client.toString());

The options you can pass into createClient are:

| Name | JS Type | Description |
| --- | --- | --- |
| connectTimeout | Number | (optional): amount of milliseconds to wait for acquiring a socket to Manta; defaults to 0 (infinity) |
| log | Object | (optional): bunyan logger; default is at level fatal and writes to stderr |
| headers | Object | (optional): HTTP headers to send on all requests |
| sign | Function | (required): see authenticating requests below |
| url | String | (required): URL to interact with Manta on |
| user | String | (optional): login name to use when interacting with the jobs API |

Authenticating Requests

When creating a manta client, you'll need to pass in a callback function for the sign parameter. node-manta ships with two functions that will likely suit your needs: privateKeySigner and sshAgentSigner. Both of these callbacks will automatically do the correct crypto for authenticating manta requests; the difference is that privateKeySigner expects a (non-passphrase protected) key to be passed in directly (as the contents of the private key file, as in the createClient example above), whereas sshAgentSigner will load your credentials on each request from the SSH agent (if available). Both callbacks require you to set the manta user (login) and keyId (SSH key fingerprint).

Note that the sshAgentSigner is not suitable for server applications, or any other system where the performance cost of interacting with the SSH agent on each request is not acceptable; put another way, you should only use it for interactive tooling, such as the CLI that ships with node-manta.

Should you wish to write a custom plugin, the expected implementation of the sign callback is a function of the form function (string, callback). string is generated by node-manta (typically the value of the Date header), and callback is of the form function (err, object), where object has the following properties:

{
    algorithm: 'rsa-sha256',   // the signing algorithm used
    keyId: '7b:c0:5c:d6:9e:11:0c:76:04:4b:03:c9:11:f2:72:7f', // key fingerprint
    signature: $base64_encoded_signature,  // the actual signature
    user: 'mark'   // the user to issue the call as.
}

Use-cases where you would need to write your own signer are things like signing with a smart-card or other HSM, making remote calls to a central system, etc.
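
As a rough illustration of the callback shape described above, the sketch below signs the string with an RSA private key using Node's crypto module. It is not how node-manta's bundled signers are implemented; makeCustomSigner and its opts fields (key, keyId, user) are hypothetical names chosen for this example, and a real plugin would replace the crypto call with a call to your smart card, HSM or remote service.

```javascript
var crypto = require('crypto');

// Hypothetical factory: returns a sign callback of the form
// function (string, callback), as described above.
// opts.key is assumed to be a PEM-encoded RSA private key.
function makeCustomSigner(opts) {
    return function sign(str, callback) {
        var signature;
        try {
            var signer = crypto.createSign('RSA-SHA256');
            signer.update(str);
            signature = signer.sign(opts.key, 'base64');
        } catch (e) {
            callback(e);
            return;
        }
        callback(null, {
            algorithm: 'rsa-sha256',
            keyId: opts.keyId,
            signature: signature,
            user: opts.user
        });
    };
}
```

The function returned by makeCustomSigner could then be passed as the sign option to createClient.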

Presigned URLs

In some cases you may want your app to be able to generate a full URL, suitable for giving out to others as a link. In these cases, you can use the presigned URL approach, and set an expires parameter. node-manta has a simple API for this that utilizes the same sign callback as all other APIs, but simply does the correct canonicalization for a URL:

    var assert = require('assert');
    var fs = require('fs');
    var manta = require('manta');

    var opts = {
        algorithm: 'RSA-SHA256',
        expires: Date.now() + 3600, // epoch time
        host: 'manta.us-east.joyentcloud.com',
        keyId: process.env.MANTA_KEY_ID,
        path: '/mark/stor/my_image.png',
        sign: manta.privateKeySigner({
            // pass the key contents, as in the createClient example
            key: fs.readFileSync(process.env.HOME + '/.ssh/id_rsa', 'utf8'),
            keyId: process.env.MANTA_KEY_ID,
            user: process.env.MANTA_USER
        }),
        user: process.env.MANTA_USER
    };

    manta.signUrl(opts, function (err, resource) {
        assert.ifError(err);

        console.log('https://us-east.manta.joyent.com' + resource);
    });

Common API options

All APIs in node-manta take options and callback as their last two parameters, where options is (usually) optional. For example, these two calls to info are identical:

var opts = {};
client.info('/jill/stor/foo', opts, function (err, info) {
    assert.ifError(err);
    ...
});

client.info('/jill/stor/foo', function (err, info) {
    assert.ifError(err);
    ...
});

If you are not passing in explicit options, the second form is always there for convenience. All API operations allow you to pass in a standard set of options, which are:

| Name | JS Type | Description |
| --- | --- | --- |
| headers | Object | Any HTTP headers to be included in this request |
| req_id | String | A unique identifier for this request (SHOULD be a uuid) |
| query | Object | A key/value set of parameters to be encoded on the URL's query string |

You can always override any node-manta behavior by passing in explicit HTTP headers, but in most cases, you should just use the "higher-level" parameters available in the specific API you are interested in.
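
To make the common options concrete, here is a small sketch that builds an options object with a fresh uuid as req_id. requestOptions is a hypothetical helper, the header and query values are placeholders, and crypto.randomUUID() assumes a reasonably recent Node.js release.

```javascript
var crypto = require('crypto');

// Hypothetical helper: builds the common per-request options described above.
function requestOptions(headers, query) {
    return {
        headers: headers || {},
        req_id: crypto.randomUUID(), // a new uuid for every request
        query: query || {}
    };
}

// Placeholder header and query values, for illustration only.
var opts = requestOptions({'cache-control': 'no-cache'}, {limit: 10});
console.log('request id: %s', opts.req_id);
```

The resulting object can be passed as the options argument of any API call, for example client.info('/jill/stor/foo', opts, callback).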

Common Callback Parameters

In almost all cases (the exception being the "streaming" APIs like ls), callbacks will be of the form function (err, result), where err is either a JavaScript Error object or null, and result is a standard node http.ClientResponse object from which you can access HTTP headers, response codes, etc. Note that if the HTTP response code was >= 400, err will be present and filled in with the Manta error code and message (see errors).
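
A callback following this convention might look like the sketch below. summarize is a hypothetical helper; it assumes that Manta errors expose the error code through err.name (as the errors section describes) and that lower-level runtime errors carry a code property such as ECONNREFUSED.

```javascript
// Hypothetical helper: summarizes the (err, result) pair passed to a callback.
function summarize(err, res) {
    if (err) {
        // err.code is set by the Node.js runtime for low-level errors;
        // otherwise fall back to err.name, filled in from the Manta error.
        return 'failed: ' + (err.code || err.name) + ': ' + err.message;
    }
    return 'ok: HTTP ' + res.statusCode;
}
```

For example, the callback passed to client.info could simply log summarize(err, res).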

Errors

All callback functions may return a JavaScript Error object. In most cases, you can simply switch on err.name, which will be correctly filled in from the error codes sent back by the server. The only cases where you cannot are lower-level errors such as ECONNREFUSED that are generated by the Node.js runtime. The complete list of manta error names is:

Directories

client.mkdir(path, [options], callback)

Create or overwrite a directory at path. mkdir is really a PUT operation, so its semantics differ slightly from mkdir(2) in POSIX (meaning you can call mkdir on the same path twice). There is no return value besides a potential error.

    client.mkdir('/jill/stor/foo', function (err) {
        assert.ifError(err);
        ...
    });

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to create |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

client.mkdirp(path, [options], callback)

Same as mkdir, except that mkdirp creates intermediate directories as required.

    client.mkdirp('/jill/stor/foo/bar/baz', function (err) {
        assert.ifError(err);
        ...
    });

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to create |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

client.ls(path, [options], callback)

Lists directory contents. This API will return an EventEmitter that will emit a stream of entries as they are returned from the server. You can listen for two distinct event types, object and directory; records of type object will have slightly more information in them. Both kinds of record have a type field; the remaining fields are described below. Optional pagination parameters can be included in the options block, and act as you would expect. There is a server-enforced limit of 1000 entries per list request, which is also the default limit; however, you can request a smaller size if need be. You can also choose to only receive entries of a certain type.

var opts = {
    offset: 0,
    limit: 256,
    type: 'object'
};
client.ls('/', opts, function (err, res) {
    assert.ifError(err);

    res.on('object', function (obj) {
        console.log(obj);
    });

    res.on('directory', function (dir) {
        console.log(dir);
    });

    res.once('error', function (err) {
        console.error(err.stack);
        process.exit(1);
    });

    res.once('end', function () {
        console.log('all done');
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to list |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output Objects

Each output object will be of this schema:

{
    name: 'foo',                            // basename of the entry
    etag: 'AABBCC',                         // only set on objects
    size: 1234,                             // only set on objects; valueOf(content-length)
    type: 'object',                         // one of directory || object
    mtime: '2012-11-09T12:34:56Z'           // ISO8601 timestamp of the last update time
}
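
Because each list request returns at most 1000 entries, listing a large directory means issuing several requests. The sketch below shows one way that could be done with the offset and limit options; listAll is a hypothetical helper, and it assumes that a page shorter than the limit means the listing is complete.

```javascript
// Hypothetical helper: pages through a directory with offset/limit and
// hands the full array of entries to callback(err, entries).
function listAll(client, dir, callback) {
    var limit = 1000; // the server-enforced maximum per request
    var entries = [];
    var finished = false;

    function done(err) {
        if (finished) {
            return;
        }
        finished = true;
        callback(err || null, err ? undefined : entries);
    }

    function page(offset) {
        client.ls(dir, {offset: offset, limit: limit}, function (err, res) {
            if (err) {
                done(err);
                return;
            }
            var count = 0;
            function onEntry(entry) {
                count++;
                entries.push(entry);
            }
            res.on('object', onEntry);
            res.on('directory', onEntry);
            res.once('error', done);
            res.once('end', function () {
                if (count < limit) {
                    done(); // short page: nothing left to fetch
                } else {
                    page(offset + count);
                }
            });
        });
    }

    page(0);
}
```

A caller would use it as listAll(client, '/jill/stor/foo', function (err, entries) { ... }). Note that it buffers every entry in memory, so for very large directories handling entries as they are emitted is preferable.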

Objects

client.put(path, stream, options, callback)

Creates or overwrites an (object) key. You pass in a ReadableStream (note that stream must support pause/resume), and upon receiving a 100-continue from manta, the bytes get sent up.

In this API, you can pass an actual 'size' attribute in the options object. If you set it, it becomes the content-length header for this request. If you don't, the request will be "streaming" (transfer-encoding=chunked), in which case your object either needs to fit into the "default" object size (currently 5 GB), or you need to pass in a max-content-length header, which will be the maximum size of your data. Additionally, you can (and should) pass in an 'md5' attribute, and you can pass a 'type' attribute, which is really the content-type. If you don't pass in 'type', this API will try to guess it from the name of the object (using the extension). Lastly, you can pass in a 'copies' attribute, which sets the number of full object copies to make server side (default is 2).

Like the other APIs, you can additionally pass extra headers, etc. in the options object. In the case of objects this is particularly useful for setting CORS headers, for example.

There is no return value besides error reporting.

Note: The example below uses the memorystream module from NPM.

var crypto = require('crypto');
var MemoryStream = require('memorystream');

var message = 'Hello World';
var opts = {
    copies: 3,
    headers: {
        'access-control-allow-origin': '*',
        'access-control-allow-methods': 'GET'
    },
    md5: crypto.createHash('md5').update(message).digest('base64'),
    size: Buffer.byteLength(message),
    type: 'text/plain'
};
var stream = new MemoryStream();

client.put('/jill/stor/hello_world.txt', stream, opts, function (err) {
    assert.ifError(err);
    ...
});

stream.end(message);

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to write to |
| stream | Stream | (required) An instance of a ReadableStream |
| options | Object | (required) overrides for this request; must include size |
| callback | Function | (required) callback of the form fn(err, res) |
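
When the whole payload is already in memory, the size and md5 options can be derived from it directly. The sketch below shows one way to do that; putOptionsFor is a hypothetical helper, not part of node-manta, and it covers only the md5, size and type attributes.

```javascript
var crypto = require('crypto');

// Hypothetical helper: derives put options from an in-memory payload.
// data may be a Buffer or a string (strings are encoded as UTF-8).
function putOptionsFor(data, type) {
    var buf = Buffer.isBuffer(data) ? data : Buffer.from(data, 'utf8');
    return {
        md5: crypto.createHash('md5').update(buf).digest('base64'),
        size: buf.length, // byte length, not character count
        type: type
    };
}

var opts = putOptionsFor('Hello World', 'text/plain');
console.log('%d bytes, md5 %s', opts.size, opts.md5);
```

Further attributes such as copies or headers can be added to the returned object before passing it to client.put.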

client.createWriteStream(path, options)

Essentially the same API/logic as client.put, but idiomatic to the node streams model. path and options are the same as put, but this API takes no callback, and instead returns an instance of stream.Writable.

Note that standard node stream semantics don't line up to when Manta has actually committed data, so the stream returned by this API emits a close event that also has the http.Response object.

Note: The example below uses the memorystream module from NPM.

var crypto = require('crypto');
var MemoryStream = require('memorystream');

var message = 'Hello World';
var opts = {
    copies: 3,
    headers: {
        'access-control-allow-origin': '*',
        'access-control-allow-methods': 'GET'
    },
    md5: crypto.createHash('md5').update(message).digest('base64'),
    size: Buffer.byteLength(message),
    type: 'text/plain'
};
var stream = new MemoryStream();
var w = client.createWriteStream('/jill/stor/hello_world.txt', opts);

stream.pipe(w);

w.once('close', function (res) {
    console.log('all done');
});

stream.end(message);

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to write to |
| options | Object | (required) overrides for this request; must include size |

client.get(path, [options], callback)

Fetches an object back from Manta, and gives you a (standard) ReadableStream.

Note this API will validate Content-MD5, and so if the downloaded object does not match, the stream will emit an error.

client.get('/jill/stor/hello_world.txt', function (err, stream) {
    assert.ifError(err);

    stream.setEncoding('utf8');
    stream.on('data', function (chunk) {
        console.log(chunk);
    });
    stream.on('end', function () {
        ...
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to fetch |
| options | Object | (optional) overrides for this request |
| callback | Function | (required) callback of the form fn(err, stream) |
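
The MD5 validation mentioned above can be pictured with the following sketch. It only illustrates the idea and is not node-manta's implementation; md5Checker is a hypothetical helper, and it assumes the expected value is the base64-encoded MD5 digest of the body (the same encoding used for the md5 option of put).

```javascript
var crypto = require('crypto');

// Hypothetical helper: accumulates chunks and compares the final MD5
// digest against the expected base64 value.
function md5Checker(expected) {
    var hash = crypto.createHash('md5');
    return {
        update: function (chunk) {
            hash.update(chunk);
        },
        // Call once, after the last chunk; a hash can only be digested once.
        matches: function () {
            return hash.digest('base64') === expected;
        }
    };
}
```

In a data handler you would call update(chunk) for each chunk and matches() once the stream ends.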

client.createReadStream(path, [options])

Fetches an object as a ReadableStream; this API is basically identical to get, except it's idiomatic to node streaming. Additionally, the returned stream will emit close at the end of request data along with the HTTP Response object.

var stream = client.createReadStream('/jill/stor/hello_world.txt');
stream.pipe(process.stdout);
stream.once('close', function (res) {
    console.error(res.statusCode);
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to read |
| options | Object | (optional) overrides for this request |

Links

client.ln(source, path, [options], callback)

Creates a new link (key) to the source object.

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| source | String | (required) Full path to the original object. |
| path | String | (required) Full path to the new link. |
| options | Object | (optional) overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

There is no return value besides a possible error.

client.ln('/jill/stor/hello_world.txt', '/jill/stor/hola_mundo.txt', function (err) {
    assert.ifError(err);

    ...
});

client.unlink(path, [options], callback)

Deletes an object or directory from Manta. If path points to a directory, the directory must be empty.

There is no return value besides a possible error.

client.unlink('/jill/stor/hello_world.txt', function (err) {
    assert.ifError(err);

    ...
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| path | String | (required) A full Manta path to delete |
| options | Object | (optional) overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Jobs

client.createJob(job, [options], callback)

Creates a new compute job in Manta.

This API is fairly flexible about what it takes, but really the best thing is for callers to just fully spec out the JSON object, like so:

{
  name: "word count",
  phases: [ {
    exec: "wc"
  }, {
    type: "reduce",
    exec: "awk '{ l += $1; w += $2; c += $3 } END { print l, w, c }'"
  } ]
}

job should be a JSON object that specifies at minimum an Array of phases. As described elsewhere, phases should be a set of objects that define your map/reduce tasks.

That being said, for simple jobs this API allows you to 'cheat' a little bit to get started by just taking in simple strings:

client.createJob("grep foo", function (err, jobId) {
    assert.ifError(err);
    ...
});

client.createJob(["grep foo", "grep bar"], function (err, jobId) {
    assert.ifError(err);
    ...
});

Note this form is only useful for map-only jobs; you cannot specify reduce tasks in this way.
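
To see how the string shorthand relates to a fully specified job, the sketch below expands it into a job object. toJobSpec is a hypothetical helper, and it assumes that each string corresponds to one map phase; the conversion node-manta actually performs may differ in detail.

```javascript
// Hypothetical helper: expands "cmd" or ["cmd1", "cmd2"] into a full job
// object, assuming every command becomes one map phase.
function toJobSpec(input, name) {
    var cmds = Array.isArray(input) ? input : [input];
    var job = {
        phases: cmds.map(function (cmd) {
            return {type: 'map', exec: cmd};
        })
    };
    if (name) {
        job.name = name;
    }
    return job;
}

console.log('%j', toJobSpec(['grep foo', 'grep bar'], 'double grep'));
```

Starting from the expanded object makes it easy to append a reduce phase before calling client.createJob.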

options allows you to set arbitrary headers (as usual), and callback is of the form function (err, jobId). jobId will be the server-created id for this job, which you can pass into the other job related APIs.

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| job | Object | (required) A job definition object, as described below |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, jobId) |

The full set of allowed options for job:

| Name | JS Type | Description |
| --- | --- | --- |
| name | String | (optional) An arbitrary name for this job |
| input | String | (optional) An arbitrary jobId to pipe from |
| phases | Array | (required) tasks to execute as part of this job |

phases must be an Array of Object, where objects have the following properties:

| Name | JS Type | Description |
| --- | --- | --- |
| type | String | (optional) one of: map or reduce |
| assets | Array[String] | (optional) an array of manta keys to be placed in your compute zones |
| exec | String | (required) the actual (shell) statement to execute |
| count | Number | (optional) an optional number of reducers for this phase (reduce-only): default is 1 |
| memory | Number | (optional) an optional amount of DRAM to give to your compute zone |

Output

Output is simply a String job id.

client.job(jobId, [options], callback)

Retrieves a job from Manta. This is the "overall" object, and will not contain input/output keys or failures.

client.job('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, job) {
    assert.ifError(err);
    ...
});

options allows you to set arbitrary headers (as usual), and callback is of the form function (err, job). job will be the job object you used in createJob with a few additional fields:

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, job) |

Output

Output is a job object that has the additional properties described above.

client.jobs([options], callback)

Lists all jobs for a user. This will stream back the full set of all jobs for a user. Currently you can filter on state by passing state into options.

client.jobs({state: 'running'}, function (err, res) {
    assert.ifError(err);

    res.on('job', function (j) {
        console.log('%j', j);
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output

Output is an EventEmitter; listen for job, error and end.

client.addJobKey(jobId, key, [options], callback)

Submits job key(s) to an existing job in Manta. key can be either a single key or an array of keys.

The keys should be fully specified paths to manta objects:

var keys = [
    '/mark/stor/foo',
    '/dave/stor/bar'
];
client.addJobKey('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', keys, function (err) {
    assert.ifError(err);
});

In the options block, in addition to the usual options, you can pass end: true to close input for this job (so you can avoid calling endJob).

There is no return object besides a possible error.

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| key | String or Array[String] | (required) A key, or a list of keys, to submit to the job |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err) |
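
Putting createJob and addJobKey together, a job can be created and fed its input in two calls. The sketch below is one way to chain them; submitJob is a hypothetical helper, and it relies on the end: true option described above so that endJob does not need to be called.

```javascript
// Hypothetical helper: creates a job, submits its input keys and closes
// input in the same request. Calls callback(err, jobId).
function submitJob(client, job, keys, callback) {
    client.createJob(job, function (err, jobId) {
        if (err) {
            callback(err);
            return;
        }
        // end: true closes input, so a separate endJob call is not needed.
        client.addJobKey(jobId, keys, {end: true}, function (err2) {
            callback(err2 || null, err2 ? undefined : jobId);
        });
    });
}
```

The jobId handed to the callback can then be passed to client.job, client.jobOutput and the other job APIs.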

client.endJob(jobId, [options], callback)

Closes input for a job, and allows a job to either finish or transition to reduce phases (and then finish).

There is no return object besides a possible error.

client.endJob('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err) {
    assert.ifError(err);
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err) |

client.cancelJob(jobId, [options], callback)

Cancels a job, which means input will be closed, and all processing will be cancelled. You should not expect output from a job that has been cancelled.

There is no return object besides a possible error.

client.cancelJob('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err) {
    assert.ifError(err);
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err) |

client.jobInput(jobId, [options], callback)

Retrieves all successfully submitted input keys for a job as a stream.

client.jobInput('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.log('Input key: %s', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output

Output is an EventEmitter; listen for key, error and end.

client.jobOutput(jobId, [options], callback)

Retrieves all successfully written output keys for a job as a stream.

client.jobOutput('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.log('Output key: %s', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output

Output is an EventEmitter; listen for key, error and end.
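
When the number of keys is small, it is often convenient to gather them into an array instead of handling each event. The sketch below does that for any of the key-emitting APIs; collectKeys is a hypothetical helper and method is expected to be one of 'jobInput', 'jobOutput' or 'jobFailures'.

```javascript
// Hypothetical helper: buffers the keys emitted by jobInput, jobOutput or
// jobFailures and calls callback(err, keys) once the stream ends.
function collectKeys(client, method, jobId, callback) {
    client[method](jobId, function (err, res) {
        if (err) {
            callback(err);
            return;
        }
        var keys = [];
        var finished = false;

        function done(e) {
            if (finished) {
                return;
            }
            finished = true;
            callback(e || null, e ? undefined : keys);
        }

        res.on('key', function (k) {
            keys.push(k);
        });
        res.once('error', done);
        res.once('end', function () {
            done();
        });
    });
}
```

For example, collectKeys(client, 'jobOutput', jobId, function (err, keys) { ... }) yields all output keys at once.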

client.jobFailures(jobId, [options], callback)

Retrieves all input keys that had failures, as a stream.

client.jobFailures('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.error('Input key %s failed', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output

Output is an EventEmitter; listen for key, error and end.

client.jobErrors(jobId, [options], callback)

Retrieves all errors for a job:

client.jobErrors('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('err', function (e) {
        console.error('%j', e);
    });

    res.once('end', function () {

    });
});

Inputs

| Name | JS Type | Description |
| --- | --- | --- |
| jobId | String | (required) A job id |
| options | Object | (optional) optional overrides for this request |
| callback | Function | (required) callback of the form fn(err, res) |

Output

Output is an EventEmitter; listen for err, error and end.

The err object has the following properties:

| Name | JS Type | Description |
| --- | --- | --- |
| id | String | job id |
| phase | Number | phase number of the failure |
| what | String | a human readable summary of what failed |
| code | String | programmatic error code |
| message | String | human readable error message |
| stderr | String | (optional) a manta key that saved the stderr for the given command |
| key | String | (optional) the input key being processed when the task failed (if manta can determine it) |
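
As a small illustration of working with these records, the sketch below turns an err object into a single log line. formatJobError is a hypothetical helper; it uses only the properties listed above and treats stderr and key as optional.

```javascript
// Hypothetical helper: renders one job error record as a log line.
function formatJobError(e) {
    var line = 'job ' + e.id + ' phase ' + e.phase + ': ' + e.what +
        ' (' + e.code + ': ' + e.message + ')';
    if (e.key) {
        line += ' key=' + e.key;
    }
    if (e.stderr) {
        line += ' stderr=' + e.stderr;
    }
    return line;
}
```

Inside the err handler of jobErrors, console.error(formatJobError(e)) would then print one line per failure.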