Analytics
The analytics package provides data science and analytics platforms for interactive analysis, distributed processing, and data governance.
Services
| Service | Description | Deploy |
|---|---|---|
| JupyterHub | Multi-user Jupyter notebooks with PySpark | ./uis deploy jupyterhub |
| OpenMetadata | Data discovery, governance, and metadata platform | ./uis deploy openmetadata |
| Apache Spark | Kubernetes-native distributed processing | ./uis deploy spark |
| Unity Catalog | Data catalog and governance | ./uis deploy unity-catalog |
Quick Start
./uis stack install analytics
This installs Spark, JupyterHub, and Unity Catalog. OpenMetadata is deployed separately:
./uis deploy postgresql # Required by OpenMetadata
./uis deploy elasticsearch # Required by OpenMetadata
./uis deploy openmetadata
How It Works
- JupyterHub gives users interactive notebooks with PySpark pre-configured
- Spark Operator runs batch jobs as Kubernetes-native SparkApplication resources
- Unity Catalog provides a three-level namespace (catalog.schema.table) for governed data access
- OpenMetadata provides data discovery, lineage tracking, and governance across all data assets