Importing tables as datasets#

The “import tables as datasets” feature is available through the API for both Hive and SQL tables.

Importing SQL tables#

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Build the list of tables to import
import_definition = project.init_tables_import()
import_definition.add_sql_table("my_sql_connection", "schema_of_the_database", "name_of_the_table")

# Prepare the import, then run it in the background
prepared_import = import_definition.prepare()
future = prepared_import.execute()

# Wait for the import to finish
import_result = future.wait_for_result()
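Several SQL tables can be added to the same import definition before preparing it, so that all of them are imported in one operation. A minimal sketch, assuming a small helper (the `add_sql_tables` name and the schema/table pairs are illustrative, not part of the API):

```python
def add_sql_tables(import_definition, connection, tables):
    """Add each (schema, table) pair to the same import definition.

    `import_definition` is the object returned by project.init_tables_import(),
    `connection` is the name of a SQL connection, and `tables` is an iterable
    of (schema, table) pairs.
    """
    for schema, table in tables:
        import_definition.add_sql_table(connection, schema, table)
    return import_definition
```

After adding the tables, call `prepare()`, `execute()` and `wait_for_result()` as above; a single prepared import then creates one dataset per table.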

Importing Hive tables#

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Build the list of tables to import
import_definition = project.init_tables_import()
import_definition.add_hive_table("hdfs_managed", "hive_table_name")

# Prepare the import, then run it in the background
prepared_import = import_definition.prepare()
future = prepared_import.execute()

# Wait for the import to finish
import_result = future.wait_for_result()
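As with SQL tables, multiple Hive tables can be queued on one import definition. A minimal sketch, assuming a small helper (the `add_hive_tables` name and the table names are illustrative, not part of the API):

```python
def add_hive_tables(import_definition, hive_database, table_names):
    """Add each named Hive table from one database to the import definition.

    `import_definition` is the object returned by project.init_tables_import(),
    `hive_database` is the Hive database name, and `table_names` is an
    iterable of table names.
    """
    for table_name in table_names:
        import_definition.add_hive_table(hive_database, table_name)
    return import_definition
```

The subsequent `prepare()` / `execute()` / `wait_for_result()` steps are unchanged.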

Reference documentation#

Classes#

dataikuapi.dss.project.TablesImportDefinition(...)

Temporary structure holding the list of tables to import.

dataikuapi.dss.project.TablesPreparedImport(...)

Result of preparing a tables import.

Functions#

add_hive_table(hive_database, hive_table)

Add a Hive table to the list of tables to import.

add_sql_table(connection, schema, table[, ...])

Add a SQL table to the list of tables to import.

execute()

Starts executing the import in the background and returns a dataikuapi.dss.future.DSSFuture to wait on the result.

init_tables_import()

Start an operation to import Hive or SQL tables as datasets into this project.

prepare()

Run the first step of the import process.