REST API

GET /sessions

Returns all the active interactive sessions.

Request Parameters

| Name | Description | Type |
|------|-------------|------|
| from | The start index to fetch sessions | int |
| size | Number of sessions to fetch | int |

Response Body

| Name | Description | Type |
|------|-------------|------|
| from | The start index of the fetched sessions | int |
| total | Number of sessions fetched | int |
| sessions | Session list | list |
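
A minimal sketch of how a client might build the paginated request and read the response fields listed above. The server address and the helper names `sessions_url` and `summarize` are assumptions made for illustration; they are not part of Livy.

```python
from urllib.parse import urlencode

# Assumption: Livy is reachable at this address; change it for your deployment.
LIVY_URL = "http://localhost:8998"


def sessions_url(base_url, start=0, size=20):
    """Build the GET /sessions URL with the `from` and `size` parameters."""
    # `from` is a reserved word in Python, so the query is built from a dict.
    query = urlencode({"from": start, "size": size})
    return f"{base_url}/sessions?{query}"


def summarize(response):
    """Return (from, total, sessions in this page) from a parsed response body."""
    return response["from"], response["total"], len(response["sessions"])


print(sessions_url(LIVY_URL, start=40, size=20))
```

The URL can be passed to any HTTP client, for example `urllib.request.urlopen`, and the JSON body decoded with `json.load` before calling `summarize`.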

POST /sessions

Creates a new interactive Scala, Python, or R shell in the cluster.

Request Body

| Name | Description | Type |
|------|-------------|------|
| kind | The session kind [1] | session kind |
| proxyUser | User to impersonate when starting the session | string |
| jars | Jars to be used in this session | list of strings |
| pyFiles | Python files to be used in this session | list of strings |
| files | Files to be used in this session | list of strings |
| driverMemory | Amount of memory to use for the driver process | string |
| driverCores | Number of cores to use for the driver process | int |
| executorMemory | Amount of memory to use per executor process | string |
| executorCores | Number of cores to use for each executor | int |
| numExecutors | Number of executors to launch for this session | int |
| archives | Archives to be used in this session | list of strings |
| queue | The name of the YARN queue to which the session is submitted | string |
| name | The name of this session | string |
| conf | Spark configuration properties | Map of key=val |
| heartbeatTimeoutInSecond | Timeout in seconds after which the session is orphaned | int |
| ttl | The timeout for this inactive session, for example 10m (10 minutes) | string |

1: Starting with version 0.5.0-incubating, this field is not required. For compatibility with previous versions, users can still set it to spark, pyspark, or sparkr, which implies that submitted code snippets are of the corresponding kind.

Response Body

The created Session.
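
As an illustration, the request body can be assembled from the fields in the table above. This is a sketch only: the server address, the helper name `create_session_request`, and the example values are assumptions, and the request is built but not sent.

```python
import json
from urllib.request import Request


def create_session_request(base_url, **fields):
    """Build (but do not send) a POST /sessions request.

    `fields` uses the names from the Request Body table, such as
    kind, driverMemory, numExecutors, conf, or ttl.
    """
    # Leave out unset fields so the server applies its own defaults.
    body = {name: value for name, value in fields.items() if value is not None}
    return Request(
        f"{base_url}/sessions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: example address and values, chosen only for illustration.
req = create_session_request(
    "http://localhost:8998",
    kind="pyspark",
    driverMemory="2g",
    numExecutors=4,
    conf={"spark.sql.shuffle.partitions": "64"},
    ttl="10m",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would submit it; the response body is the created Session object.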

GET /sessions/{sessionId}

Returns the session information.

Response Body

The Session.

GET /sessions/{sessionId}/state

Returns the state of the session.

Response

| Name | Description | Type |
|------|-------------|------|
| id | Session id | int |
| state | The current state of the session | string |
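
This endpoint is convenient for polling. The sketch below waits until a session reports `idle`, using the state values from the Session State table. The function `wait_until_idle` and its `get_state` callback are hypothetical; `get_state` stands in for whatever HTTP call returns the parsed response of this endpoint.

```python
import time

# States after which the session will not become idle (from the Session State table).
FINAL_STATES = {"shutting_down", "error", "dead", "killed", "success"}


def wait_until_idle(get_state, poll_seconds=1.0, max_polls=60, sleep=time.sleep):
    """Poll until the session is idle.

    get_state() must return the parsed JSON body of
    GET /sessions/{sessionId}/state, e.g. {"id": 0, "state": "starting"}.
    """
    for _ in range(max_polls):
        state = get_state()["state"]
        if state == "idle":
            return state
        if state in FINAL_STATES:
            raise RuntimeError(f"session ended in state {state!r}")
        sleep(poll_seconds)
    raise TimeoutError("session did not become idle in time")


# Demo with canned responses instead of a live server.
canned = iter([{"id": 0, "state": "starting"}, {"id": 0, "state": "idle"}])
print(wait_until_idle(lambda: next(canned), sleep=lambda seconds: None))
```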

DELETE /sessions/{sessionId}

Kills the Session job.

GET /sessions/{sessionId}/log

Gets the log lines from this session.

Request Parameters

| Name | Description | Type |
|------|-------------|------|
| from | Offset | int |
| size | Max number of log lines to return | int |

Response Body

| Name | Description | Type |
|------|-------------|------|
| id | The session id | int |
| from | Offset from start of log | int |
| size | Max number of log lines | int |
| log | The log lines | list of strings |
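
One possible way to read a whole log in pages is sketched below. It assumes that `from` in the response is the index of the first returned line, so the next page starts at `from` plus the number of lines received; verify this against your Livy version. `read_all_log_lines` and `fetch_page` are hypothetical names, and the demo uses an in-memory log rather than a server.

```python
def read_all_log_lines(fetch_page, page_size=100):
    """Collect log lines page by page.

    fetch_page(start, size) must return the parsed JSON body of
    GET /sessions/{sessionId}/log?from=start&size=size.
    """
    lines = []
    start = 0
    while True:
        page = fetch_page(start, page_size)
        chunk = page["log"]
        lines.extend(chunk)
        # A short (or empty) page means the end of the log was reached.
        if len(chunk) < page_size:
            return lines
        # Assumption: `from` is the index of the first line in this page.
        start = page["from"] + len(chunk)


# Demo: a fake five-line log served two lines at a time.
FAKE_LOG = [f"line {i}" for i in range(5)]


def fake_fetch(start, size):
    return {"id": 0, "from": start, "size": size, "log": FAKE_LOG[start:start + size]}


print(read_all_log_lines(fake_fetch, page_size=2))
```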

GET /sessions/{sessionId}/statements

Returns all the statements in a session.

Request Parameters

| Name | Description | Type |
|------|-------------|------|
| from | The start index to fetch statements | int |
| size | Number of statements to fetch | int |
| order | Set to "desc" to get statements in descending order (by default, the list is in ascending order) | string |

Response Body

| Name | Description | Type |
|------|-------------|------|
| statements | Statement list | list |

POST /sessions/{sessionId}/statements

Runs a statement in a session.

Request Body

| Name | Description | Type |
|------|-------------|------|
| code | The code to execute | string |
| kind | The kind of code to execute [2] | code kind |

2: If the session kind is not specified, or the submitted code is not of the kind specified at session creation, this field must be set to the correct kind. Otherwise, Livy uses the kind specified at session creation as the default code kind.

Response Body

The statement object.
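
A short sketch of building this request, with `kind` included only when the caller supplies it. The server address, session id, and the helper `statement_request` are assumptions for illustration; the request is constructed but not sent.

```python
import json
from urllib.request import Request


def statement_request(base_url, session_id, code, kind=None):
    """Build (but do not send) POST /sessions/{sessionId}/statements."""
    body = {"code": code}
    # Omit `kind` to fall back to the kind given at session creation.
    if kind is not None:
        body["kind"] = kind
    return Request(
        f"{base_url}/sessions/{session_id}/statements",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumption: example address and session id.
req = statement_request("http://localhost:8998", 0, "SELECT 1", kind="sql")
print(req.full_url)
print(req.data.decode("utf-8"))
```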

GET /sessions/{sessionId}/statements/{statementId}

Returns a specified statement in a session.

Response Body

The statement object.

POST /sessions/{sessionId}/statements/{statementId}/cancel

Cancels the specified statement in this session.

Response Body

| Name | Description | Type |
|------|-------------|------|
| msg | Is always "canceled" | string |

POST /sessions/{sessionId}/completion

Returns code completion candidates for the specified code in the session.

Request Body

| Name | Description | Type |
|------|-------------|------|
| code | The code for which completion proposals are requested | string |
| kind | The kind of code to execute [2] | code kind |
| cursor | Cursor position at which to get proposals | string |

Response Body

| Name | Description | Type |
|------|-------------|------|
| candidates | Code completion proposals | array[string] |

GET /batches

Returns all the active batch sessions.

Request Parameters

| Name | Description | Type |
|------|-------------|------|
| from | The start index to fetch sessions | int |
| size | Number of sessions to fetch | int |

Response Body

| Name | Description | Type |
|------|-------------|------|
| from | The start index of fetched sessions | int |
| total | Number of sessions fetched | int |
| sessions | Batch list | list |

POST /batches

Creates a new batch session.

Request Body

| Name | Description | Type |
|------|-------------|------|
| file | File containing the application to execute | path (required) |
| proxyUser | User to impersonate when running the job | string |
| className | Application Java/Spark main class | string |
| args | Command line arguments for the application | list of strings |
| jars | Jars to be used in this session | list of strings |
| pyFiles | Python files to be used in this session | list of strings |
| files | Files to be used in this session | list of strings |
| driverMemory | Amount of memory to use for the driver process | string |
| driverCores | Number of cores to use for the driver process | int |
| executorMemory | Amount of memory to use per executor process | string |
| executorCores | Number of cores to use for each executor | int |
| numExecutors | Number of executors to launch for this session | int |
| archives | Archives to be used in this session | list of strings |
| queue | The name of the YARN queue to which the batch is submitted | string |
| name | The name of this session | string |
| conf | Spark configuration properties | Map of key=val |

Response Body

The created Batch object.
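
The sketch below assembles a batch request body and enforces the one required field, `file`. The helper `batch_body`, the file path, and the class name are hypothetical and serve only as an example.

```python
import json


def batch_body(file, class_name=None, args=None, **options):
    """Assemble the JSON body for POST /batches; `file` is the only required field."""
    if not file:
        raise ValueError("`file` is required")
    body = {"file": file}
    if class_name is not None:
        body["className"] = class_name
    if args:
        body["args"] = list(args)
    # Remaining options use the names from the Request Body table.
    body.update({name: value for name, value in options.items() if value is not None})
    return body


# Assumption: hypothetical application path, main class, and arguments.
example = batch_body(
    "/path/to/app.jar",
    class_name="com.example.Main",
    args=["--date", "2024-01-01"],
    numExecutors=2,
)
print(json.dumps(example, indent=2))
```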

GET /batches/{batchId}

Returns the batch session information.

Response Body

The Batch.

GET /batches/{batchId}/state

Returns the state of the batch session.

Response

| Name | Description | Type |
|------|-------------|------|
| id | Batch session id | int |
| state | The current state of the batch session | string |

DELETE /batches/{batchId}

Kills the Batch job.

GET /batches/{batchId}/log

Gets the log lines from this batch.

Request Parameters

| Name | Description | Type |
|------|-------------|------|
| from | Offset | int |
| size | Max number of log lines to return | int |

Response Body

| Name | Description | Type |
|------|-------------|------|
| id | The batch id | int |
| from | Offset from start of log | int |
| size | Number of log lines | int |
| log | The log lines | list of strings |

REST Objects

Session

A session represents an interactive shell.

| Name | Description | Type |
|------|-------------|------|
| id | The session id | int |
| appId | The application id of this session | string |
| owner | Remote user who submitted this session | string |
| proxyUser | User to impersonate when running | string |
| kind | Session kind (spark, pyspark, sparkr, or sql) | session kind |
| log | The log lines | list of strings |
| state | The session state | string |
| appInfo | The detailed application info | Map of key=val |
| jars | Jars to be used in this session | list of strings |
| pyFiles | Python files to be used in this session | list of strings |
| files | Files to be used in this session | list of strings |
| driverMemory | Amount of memory to use for the driver process | string |
| driverCores | Number of cores to use for the driver process | int |
| executorMemory | Amount of memory to use per executor process | string |
| executorCores | Number of cores to use for each executor | int |
| numExecutors | Number of executors to launch for this session | int |
| archives | Archives to be used in this session | list of strings |
| queue | The name of the YARN queue to which the session is submitted | string |
| conf | Spark configuration properties | Map of key=val |

Session State

| Value | Description |
|-------|-------------|
| not_started | Session has not been started |
| starting | Session is starting |
| idle | Session is waiting for input |
| busy | Session is executing a statement |
| shutting_down | Session is shutting down |
| error | Session errored out |
| dead | Session has exited |
| killed | Session has been killed |
| success | Session is successfully stopped |

Session Kind

| Value | Description |
|-------|-------------|
| spark | Interactive Scala Spark session |
| pyspark | Interactive Python Spark session |
| sparkr | Interactive R Spark session |
| sql | Interactive SQL Spark session |

Starting with version 0.5.0-incubating, each session supports the Scala, Python, and R interpreters, along with the newly added SQL interpreter. The kind field is therefore no longer required at session creation; instead, users should specify the code kind (spark, pyspark, sparkr, or sql) when submitting a statement.

For compatibility with previous versions, users can still specify kind at session creation and omit kind at statement submission. Livy then uses the session kind as the default kind for all submitted statements.

To submit code of a kind other than the default specified at session creation, users must specify the code kind (spark, pyspark, sparkr, or sql) at statement submission.
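
The precedence described above can be modeled on the client side as follows. This is an illustrative sketch of the rule, not Livy's implementation; `effective_kind` is a hypothetical helper.

```python
CODE_KINDS = ("spark", "pyspark", "sparkr", "sql")


def effective_kind(session_kind=None, statement_kind=None):
    """Return the kind a statement would run as.

    The kind given at statement submission wins; otherwise the kind given
    at session creation is the default.
    """
    kind = statement_kind if statement_kind is not None else session_kind
    if kind is None:
        raise ValueError("no kind given at session creation or statement submission")
    if kind not in CODE_KINDS:
        raise ValueError(f"unknown kind {kind!r}; expected one of {CODE_KINDS}")
    return kind


print(effective_kind(session_kind="pyspark"))
print(effective_kind(session_kind="pyspark", statement_kind="sql"))
```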

pyspark

To change the Python executable the session uses, Livy reads the path from the environment variable PYSPARK_PYTHON (the same as pyspark).

Starting with version 0.5.0-incubating, the session kind “pyspark3” is removed; instead, users must set PYSPARK_PYTHON to a python3 executable.

As with pyspark, if Livy is running in local mode, just set the environment variable. If the session is running in yarn-cluster mode, set spark.yarn.appMasterEnv.PYSPARK_PYTHON in SparkConf so that the environment variable is passed to the driver.
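
For the yarn-cluster case, the property can be supplied through the `conf` field of the session creation request. A minimal sketch, assuming a hypothetical helper `pyspark_session_body` and an example interpreter path:

```python
import json


def pyspark_session_body(python_path, yarn_cluster=True):
    """Build a POST /sessions body that selects the Python executable."""
    body = {"kind": "pyspark"}
    if yarn_cluster:
        # Passed through SparkConf so the variable reaches the driver.
        body["conf"] = {"spark.yarn.appMasterEnv.PYSPARK_PYTHON": python_path}
    # In local mode the request carries nothing extra: PYSPARK_PYTHON is set
    # in the environment instead.
    return body


# Assumption: example path to a python3 executable.
print(json.dumps(pyspark_session_body("/usr/bin/python3"), indent=2))
```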

Statement

A statement represents the result of executing a piece of code.

| Name | Description | Type |
|------|-------------|------|
| id | The statement id | integer |
| code | The execution code | string |
| state | The execution state | statement state |
| output | The execution output | statement output |
| progress | The execution progress | double |
| started | The start time of statement code | long |
| completed | The complete time of statement code | long |

Statement State

| Value | Description |
|-------|-------------|
| waiting | Statement is enqueued but execution hasn't started |
| running | Statement is currently running |
| available | Statement has a response ready |
| error | Statement failed |
| cancelling | Statement is being cancelled |
| cancelled | Statement is cancelled |

Statement Output

| Name | Description | Type |
|------|-------------|------|
| status | Execution status | string |
| execution_count | A monotonically increasing number | integer |
| data | Statement output | An object mapping a mime type to the result. If the mime type is application/json, the value is a JSON value. |
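
Because `data` maps mime types to results, a client typically picks the representation it prefers. The sketch below is one way to do that; `pick_result` is a hypothetical helper and the sample statement is invented for the demo.

```python
def pick_result(statement, preferred=("application/json", "text/plain")):
    """Return (mime_type, value) for the first preferred mime type present.

    Returns None if the statement has no response ready yet or carries
    none of the preferred mime types.
    """
    if statement.get("state") != "available":
        return None
    data = (statement.get("output") or {}).get("data") or {}
    for mime in preferred:
        if mime in data:
            return mime, data[mime]
    return None


# Assumption: a made-up statement object shaped like the tables above.
sample = {
    "id": 0,
    "code": "1 + 1",
    "state": "available",
    "output": {"status": "ok", "execution_count": 0, "data": {"text/plain": "2"}},
}
print(pick_result(sample))
```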

Batch

| Name | Description | Type |
|------|-------------|------|
| id | The session id | int |
| appId | The application id of this session | string |
| appInfo | The detailed application info | Map of key=val |
| ttl | The timeout for this inactive session, for example 10m (10 minutes) | string |
| log | The log lines | list of strings |
| state | The batch state | string |

Proxy User - doAs support

If superuser support is configured, Livy supports the doAs query parameter to specify the user to impersonate. The doAs query parameter can be used on any supported REST endpoint described above to perform the action as the specified user. If both doAs and proxyUser are specified during session or batch creation, the doAs parameter takes precedence.
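
A small sketch of attaching the doAs query parameter to an endpoint URL while keeping any existing parameters. The helper `with_do_as`, the server address, and the user names are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_do_as(url, user):
    """Return `url` with the doAs query parameter set to `user`."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier doAs value.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != "doAs"
    ]
    query.append(("doAs", user))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumption: example address and user names.
print(with_do_as("http://localhost:8998/sessions?from=0&size=10", "alice"))
print(with_do_as("http://localhost:8998/batches", "bob"))
```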