e6data
e6data is a lakehouse compute engine that runs high-concurrency SQL analytics and AI workloads 5-10x faster and at over 50% lower TCO.
Product Documentation
- Welcome to e6data: e6data is a lakehouse compute engine built to run high concurrency, complex SQL analytics and AI workloads—10x faster, 60% cheaper, zero data movement.
- Introduction to e6data: e6data is a lakehouse compute engine built to run high concurrency, complex SQL analytics and AI workloads—10x faster, 60% cheaper, zero data movement.
- Concepts: This section introduces the set of fundamental concepts and terminology used by e6data. Understanding these will help to use e6data effectively.
- Architecture: Understand how e6data is structured
- e6data in VPC Deployment Model
- Connect to e6data serverless compute
- Hybrid Data Lakehouse
- Get Started: Get ready to start querying!
- Sign Up: This article will help you to create your e6data account
- Setup: Setting up your cloud for e6data
- AWS Setup: The page provides setup guides for deploying e6data on AWS.
- In VPC Deployment (AWS)
- Prerequisite Infrastructure
- Infrastructure & Permissions for e6data
- Setup Kubernetes Components
- Setup using Terraform in AWS
- Update an AWS Terraform for your Workspace
- AWS PrivateLink and e6data
- VPC Peering | e6data on AWS: Manage and deploy AWS resources using the AWS provider in Terraform. Ensure correct setup using the official configuration guidelines.
- Connect to e6data serverless compute (AWS)
- Configuring Secure Access
- Overview
- Deployment Guide
- CloudFormation Script
- Workspace Creation: This page outlines the process of creating a workspace for connecting to e6data serverless compute.
- Catalog Creation: The page outlines the process of creating a catalog for connecting to e6data serverless compute
- Glue Metastore
- Hive Metastore
- Unity Catalog
- Cluster Creation: This page outlines the process for creating an e6data serverless compute cluster.
- GCP Setup: The page provides setup guides for deploying e6data on GCP.
- In VPC Deployment (GCP): This page outlines the steps and prerequisites for in-VPC deployment on GCP.
- Prerequisite Infrastructure
- Infrastructure & Permissions for e6data
- Setup Kubernetes Components
- Setup using Terraform in GCP: Deploying an e6data Workspace in GCP using Terraform
- Update a GCP Terraform for your Workspace
- Configure GCS Access for Serverless Compute (GCP)
- Prerequisites
- FAQ's and Troubleshooting
- Azure Setup
- In VPC Deployment (Azure): Deploying e6data Workspace in Microsoft Azure using Terraform
- Prerequisite Infrastructure
- Infrastructure & Permissions for e6data
- Setup Kubernetes Components
- Setup using Terraform in AZURE: Deploying e6data Workspace in Microsoft Azure using Terraform
- Update an AZURE Terraform for your Workspace
- Configure Azure Storage Access for Serverless Compute (Azure)
- Steps to be Performed by Customer Account
- FAQ's and Troubleshooting
- Workspaces: Understanding Workspaces in e6data
- Create Workspaces
- Enable/Disable Workspaces
- Update a Workspace
- Delete a Workspace
- Catalogs: Understanding Catalogs in e6data
- Create Catalogs
- Hive Metastore: Managing connections to Hive Metastore from e6data
- Connect to a Hive Metastore
- Edit a Hive Metastore Connection
- Delete a Hive Metastore Connection
- Glue Metastore: Creating e6data catalog using Glue Metastore
- Connect to a Glue Metastore
- Edit a Glue Metastore Connection
- Delete a Glue Metastore Connection
- Unity Catalog: Creating e6data catalog using Unity catalog
- Connect to Unity Catalog
- Edit Unity Catalog
- Delete Unity Catalog
- Apache Polaris
- Connect to Apache Polaris
- Edit Polaris Catalog
- Delete Polaris Catalog
- Cross-account Catalog Access
- Configure Cross-account Catalog to Access AWS Hive Metastore
- Configure Cross-account Catalog to Access Unity Catalog
- Configure Cross-account Catalog to Access AWS Glue
- Configure Cross-account Catalog to Access GCP Hive Metastore
- Manage Catalogs
- Privileges: Understanding Privileges in Catalog
- Access Control
- Column Masking
- Row Filter
- Table Formats
- Delta Lake
- Connect to Catalog
- Apache Iceberg
- Connect to Catalog
- Apache Hudi
- Connect to Catalog
- Clusters: Configure a cluster
- Edit & Delete Clusters: Change the configuration of cluster
- Suspend & Resume Clusters
- Cluster Size: Understand the cluster sizes offered during cluster creation.
- Load Based Sizing: This page describes load-based sizing for clusters.
- Auto Suspension: This page describes the auto-suspension feature for clusters.
- Query Timeout: This page describes the query timeout feature for clusters.
- Monitoring
- Connection Info
- Pools: This page gives an overview of Pools.
- Delete Pools: This page outlines Pool deletion and Pool permissions.
- Query Editor: Run queries using the e6data Query Editor
- Editor Pane: This page explains the features and functionalities of the Editor Pane in e6data's Query Editor.
- Results Pane: This page explains how to view and interpret query results in the e6data Query Editor.
- Schema Explorer: This page explains how to navigate and interact with the Schema Explorer in the Query Editor.
- Data Preview: This page explains how to preview dataset samples within the Query Editor.
- Notebook: Run queries using the e6data Query Notebook
- Editor Pane: This page explains the features and functionalities of the Notebook Editor Pane in e6data.
- Results Pane: This page explains how to view and interpret query results in the e6data Notebook.
- Schema Explorer: This page explains how to navigate and interact with the Schema Explorer in the Notebook.
- Data Preview: This page explains how to preview dataset samples within the Notebook
- Query History: This page explains how to view, manage, and track past queries executed in the e6data Query Editor.
- Query Count API: Get insights into query execution volume over time in e6data.
- Connectivity: This page explains the network and integration options available for connecting to e6data.
- IP Sets: This page explains how to manage allowed IP ranges for secure access to e6data.
- Endpoints: This page explains how to configure and manage network endpoints for connecting to e6data.
- Cloud Resources: This page explains how to configure and manage cloud-based resources for e6data connectivity and storage.
- Network Firewall: The e6data Network Firewall feature allows users to manage IP whitelisting, enabling or restricting access to e6data at the cluster level.
- Access Control: Overview of Access Control mechanisms in e6data.
- Users: Managing user accounts.
- Groups: Grouping users for easier management.
- Roles: Assign roles and permissions to users.
- Permissions: Manage user access efficiently by grouping permissions in e6data.
- Policies: Define and enforce access control rules.
- Single Sign-On (SSO): Enable seamless and secure authentication for e6data using SSO.
- AWS SSO: Configure AWS Single Sign-On for secure authentication and access management.
- Okta: Set up Okta for seamless Single Sign-On (SSO)
- AWS Cognito Integration (OAuth 2.0)
- MICROSOFT ENTRA ID: Configure Single Sign-On using Microsoft My Apps for streamlined authentication.
- Icons for IdP: Customize identity provider (IdP) icons for better user recognition.
- Service Accounts: Manage automated access with dedicated service accounts.
- Multi-Factor Authentication (Beta): Enhance security with an additional verification layer.
- Usage and Cost Management: Usage and Cost Management tracks resource usage and optimizes costs for efficient operations.
- Audit Log: The Audit Log feature of e6data helps you track administrative actions related to workspaces, catalogs, and clusters.
- User Settings: Manage profile details,
- Profile: This page describes the user profile.
- Personal Access Tokens (PAT): This page describes authentication using e6data personal access tokens.
- Advanced Features: Access extended settings and configurations
- Cross-Catalog & Cross-Schema Querying: Execute queries across multiple catalogs and schemas seamlessly.
- Supported Data Types: This document contains the datatypes supported by e6data
- SQL Command Reference: e6data supports the following categories of functions:
- Query Syntax: Guidelines and structure for writing queries effectively.
- General functions: Commonly used functions for data processing.
- Aggregate Functions: Aggregate functions operate on multiple sets of values and return a single value.
- Mathematical Functions & Operators: This page contains the Mathematical functions and operators supported by e6data.
- Arithmetic Operators: Perform mathematical operations in queries.
- Rounding and Truncation Functions: Adjust numerical values by rounding or truncating.
- Exponential and Root Functions: Perform exponential calculations and extract roots of numbers.
- Trigonometric Functions: Compute sine, cosine, tangent, and other trigonometric values.
- Logarithmic Functions: Calculate logarithms using various bases for numerical analysis.
- String Functions: This document contains the string functions supported by e6data.
- Date-Time Functions: This document contains the date-time functions supported by e6data
- Constant Functions: Return fixed values that remain unchanged in computations.
- Conversion Functions: Transform data types and formats for compatibility.
- Date Truncate Function: Trim date and time values to a specified precision.
- Addition and Subtraction Functions: Perform arithmetic operations on numerical and date values.
- Extraction Functions: Retrieve specific components from dates, times, and strings.
- Format Functions: Modify the appearance of dates, numbers, and text values
- Timezone Functions: Convert and manipulate timestamps across different time zones.
- Conditional Expressions: Execute logic-based operations based on specified conditions.
- Conversion Functions: This page contains the explicit conversion functions supported by e6data.
- Window Functions: This page contains window functions supported by e6data.
- Comparison Operators & Functions: This page contains the Comparison operators supported by e6data.
- Logical Operators: This page contains logical operators supported by e6data.
- Statistical Functions: Uncategorized additional functions supported by e6data
- Bitwise Functions: Bitwise functions supported by e6data
- Array Functions: Array functions supported by e6data
- Regular Expression Functions: Perform pattern matching and text manipulation using regex.
- Generate Functions: Create sequences, arrays, or structured data dynamically.
- Cardinality Estimation Functions: Approximate the number of unique elements in a dataset.
- JSON Functions: Parse, manipulate, and extract data from JSON structures.
- Checksum Functions: Generate and verify hash values for data integrity.
- Unload Function (Copy into): Export query results to external storage efficiently.
- Struct Functions: Work with structured data by creating and manipulating nested fields.
- Geospatial Functions
- Equivalent Functions & Operators: Compare values and expressions for equality and similarity.
- Connectors & Drivers: Integrate with external systems using supported connectors and drivers.
- DBeaver: Connect and interact with databases using the DBeaver SQL client.
- DbVisualizer: Access and manage databases using the DbVisualizer tool.
- Apache Superset: Visualize and explore data with interactive dashboards and charts.
- Jupyter Notebook: Execute queries and analyze data within an interactive notebook environment.
- Tableau Cloud: Connect and visualize data using Tableau’s cloud-based analytics platform.
- Tableau Desktop: Analyze and visualize data locally with Tableau’s desktop application.
- Power BI: Power BI is a Microsoft business intelligence platform that allows users to visualize and analyze data from various sources, including SQL databases.
- Setting up Power BI on-premises Gateway: Set up Power BI Gateway to connect Power BI with e6data for secure and seamless reporting.
- Metabase: Explore and visualize data using an open-source business intelligence tool.
- Zeppelin: Perform interactive data analytics with Apache Zeppelin notebooks.
- Python Connector: Integrate and interact with data using the Python API.
- Performance and Integration Guide
- Code Samples: Python code snippets to carry out common operations on e6data
- JDBC Driver: Connect to databases using the Java Database Connectivity (JDBC) standard.
- Code Samples: Java code snippets to carry out common operations on e6data via JDBC Driver
- API Support: Access and interact with data programmatically using REST APIs.
- Configure Cluster Ingress: Securely enabling ingress to e6data clusters for external services
- ALB Ingress in Kubernetes: Configuring ALB Ingress in Kubernetes
- GCE Ingress in Kubernetes: Configuring GCE Ingress In Kubernetes
- Ingress-Nginx in Kubernetes: Configuring Ingress-Nginx in Kubernetes
- PySpark Compatibility
- Getting started
- Code samples
- DataFrame Operations
- SQL Functions
- Security & Trust: Ensure data protection, compliance, and secure access controls.
- Best Practices: Best practices to manage your e6data deployment
- AWS Best Practices
- Features & Responsibilities Matrix: Define roles and access levels for various features.
- Data Protection Addendum(DPA): DATA PROTECTION ADDENDUM
- Tutorials and Best Practices: This page helps you understand how to use the e6data platform and
- How to configure HIVE metastore if you don't have one?: This article will guide you on how to set up a HIVE metastore in case you don't have a metastore.
- How-To Videos: Tutorial videos on how to carry out common operations in the e6data platform.
- Known Limitations: Identify current constraints and restrictions in the system.
- SQL Limitations: Understand constraints and unsupported features in SQL execution.
- Other Limitations: Recognize additional constraints affecting functionality and performance.
- Restart Triggers: Manage and configure conditions for automatic process restarts.
- Cloud Provider Limitations: Understand restrictions imposed by different cloud platforms.
- Error Codes: This page consists of all the errors that are displayed on the screen.
- General Errors: Identify and resolve common system and user-related issues.
- User Account Errors: Troubleshoot authentication, access, and profile-related issues.
- Workspace Errors: Diagnose and resolve issues related to workspace setup and usage.
- Catalog Errors: Address issues encountered while managing catalogs and metadata.
- Cluster Errors: Troubleshoot failures and performance issues in cluster operations.
- Data Governance Errors: Resolve issues related to access control, policies, and compliance.
- Query History Errors: Address issues with logging, tracking, and retrieving query history.
- Query Editor Errors: Troubleshoot issues related to query execution and interface functionality.
- Pool Errors: Identify and resolve issues affecting resource pooling and allocation.
- Connectivity Errors: Troubleshoot network and connection-related issues.
- Terms & Condition: Understand the rules and policies governing usage and access.
- Privacy Policy: Learn how data is collected, stored, and protected.
- Cookie Policy: Understand how cookies are used for functionality and analytics.
- FAQs: Find answers to common questions about features, usage, and troubleshooting.
- Workspace Setup: Frequently Asked Questions about Workspace
- Security: Frequently Asked Questions about Security
- Catalog Privileges: Frequently Asked Questions about Catalog Privileges
- Services Utilised for e6data Deployment: This page outlines the key services and resources required for deploying e6data on various cloud platforms.
- AWS supported regions: This page lists the AWS regions where e6data can be deployed, ensuring optimal performance and compliance with regional requirements.
- GCP supported regions: This article lists the regions supported by e6data in GCP
- AZURE supported regions: This article lists the regions supported by e6data in AZURE.
- Release Notes & Updates: New features, announcements & bug fixes
- 6th August 2025
- 23rd July 2025
- 6th Sept 2024: Enhanced Data Analyst role
- 6th June 2024: Latest update: MFA and service accounts.
- 18th April 2024: This page covers the latest updates to e6data, including improved connectivity, security, and query editing features.
- 9th April 2024: This page explains the latest updates to e6data, focusing on schema behavior within the Schema Explorer for improved navigation and management.
- 30th March 2024: This page covers recent e6data updates, including catalog privileges for column masking and row filtering, query history enhancements, and support for deletion vectors in Iceberg tables.
- 16th March 2024: This page highlights the new feature for exporting query history to CSV, enhancing data analysis and reporting capabilities on the e6data platform.
- 14th March 2024: This page covers the latest updates to e6data, including the new DataExport role, improved resource selection in catalog privileges, and the addition of client-perceived time in query history.
- 12th March 2024: Covers recent e6data updates, including enhanced cluster connection info, improved connectivity, and catalog refresh capabilities for Data Analyst roles.
- 2nd March 2024: Covers the latest e6data updates, including catalog privileges (Beta), support for liquid clustering, new functions, and the impersonation feature for the Metabase BI tool via Apache Ranger.
- 10th February 2024: This page introduces the new gateway connectivity feature, enabling seamless external client connections through endpoints on the e6data platform.
- 3rd February 2024: This page covers the new connectivity and notebook features on the e6data platform.
- 17th January 2024: This page highlights the introduction of new functionalities on the e6data platform.
- 9th January 2024: This page covers the latest e6data enhancements, including new functionalities, bug fixes, and performance optimizations.
- 3rd January 2024: This page outlines the platform enhancement and catalog auto-refresh feature
- 18th December 2023: This page covers backend and platform enhancements, including improved security for monitoring components and enhanced access control.
- 12th December 2023: This page introduces the platform enhancement, including the new Find and Replace feature in the query editor.
- 9th December 2023: Plugin, platform, engine enhancements.
- 4th December 2023: Improvements to query Editor session handling.
- 27th November 2023: UI enhancements.
- 8th September 2023: User roles and privileges.
- 4th September 2023: Covers the new features, including the ability to enable and disable workspaces, along with UI enhancements.
- 26th August 2023: Covers pod metrics, workspace updates, editable query history views, and UI enhancements.
- 21st August 2023: This page covers the new query editor roles, query resume, customizable views, catalog editing, and external access to clusters.
- 19th July 2023: This page highlights query editor enhancements, schema search, an enhanced query run button, and visual/UI improvements.
- 23rd May 2023: Covers the introduction of the Workspace Admin role, multi-catalog support, SQL optimizations, bug fixes, and known limitations.
- 5th May 2023: This page covers tab management, cluster tag improvements, SQL optimizations, bug fixes, and known limitations.
- 28th April 2023: This page highlights data preview and query editor UI improvements.
- 19th April 2023: This page covers IP allowlisting for external access.
- 15th April 2023: This page highlights cross-account catalog access, execution planner, and query editor enhancements.
- 10th April 2023: Covers auto save in query editor and SSO login via e6data portal and Google IdP.
- 30th March 2023: This page covers SSO support, AWS S3 Gateway Endpoints, and parquet data pruning.
Agent Instructions: Querying This Documentation
If you need additional information, you can query the documentation dynamically by asking a question.
Perform an HTTP GET request on a page URL with the ask query parameter:
GET https://docs.e6data.com/product-documentation/welcome-to-e6data.md?ask=<question>
The question should be specific, self-contained, and written in natural language. The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.
Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
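The request described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_ask_url` and `ask_docs` are hypothetical, and the response is assumed to be UTF-8 text because the documentation does not specify its format.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen

# Page URL taken from the instructions above.
DOCS_PAGE = "https://docs.e6data.com/product-documentation/welcome-to-e6data.md"


def build_ask_url(page_url: str, question: str) -> str:
    """Return page_url with the question attached as the `ask` query parameter."""
    parts = urlsplit(page_url)
    # Keep any existing query parameters, but drop a previous `ask` value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "ask"]
    query.append(("ask", question))
    # quote_via=quote percent-encodes spaces as %20 instead of "+".
    return urlunsplit(parts._replace(query=urlencode(query, quote_via=quote)))


def ask_docs(page_url: str, question: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the body as text (UTF-8 is an assumption)."""
    with urlopen(build_ask_url(page_url, question), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds and prints the URL; call ask_docs(...) to perform the request.
    print(build_ask_url(DOCS_PAGE, "How do I connect to a Glue Metastore?"))
```

Percent-encoding the question keeps characters such as spaces and `?` from being misread as part of the URL structure.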