sap-hana-ml

SAP HANA ML Python Client (hana-ml)

Package Version: 2.22.241011
Last Verified: 2025-11-27

Installation & Setup

```bash
pip install hana-ml
```

Requirements: Python 3.8+, SAP HANA 2.0 SPS03+ or SAP HANA Cloud
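
The Python-side requirement can be checked before connecting to anything. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of hana-ml, and it does not verify the SAP HANA server version.

```python
import sys
from importlib import metadata


def check_environment(min_python=(3, 8)):
    """Return (python_ok, hana_ml_version); the version is None if not installed."""
    python_ok = sys.version_info >= min_python
    try:
        # Looks up the installed distribution named "hana-ml" (the PyPI name).
        version = metadata.version("hana-ml")
    except metadata.PackageNotFoundError:
        version = None
    return python_ok, version


python_ok, hana_ml_version = check_environment()
print(f"Python OK: {python_ok}, hana-ml version: {hana_ml_version}")
```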

Quick Start

Connection & DataFrame

```python
from hana_ml import ConnectionContext

# Connect
conn = ConnectionContext(
    address='<hostname>',
    port=443,
    user='<username>',
    password='<password>',
    encrypt=True
)

# Create DataFrame
df = conn.table('MY_TABLE', schema='MY_SCHEMA')
print(f"Shape: {df.shape}")
df.head(10).collect()
```
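
Hard-coded credentials are fine for a first test but awkward anywhere else. One possible approach, sketched below, reads the connection settings from environment variables. The `HANA_*` variable names and the helper `connection_kwargs_from_env` are assumptions made for illustration; they are not defined by hana-ml.

```python
import os


def connection_kwargs_from_env(env=None):
    """Build ConnectionContext keyword arguments from environment variables.

    The HANA_* variable names are hypothetical; use whatever fits your setup.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 as in the Quick Start example
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }


# Usage (requires hana-ml and a reachable HANA instance):
# from hana_ml import ConnectionContext
# conn = ConnectionContext(**connection_kwargs_from_env())
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.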

PAL Classification

```python
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Train model
clf = UnifiedClassification(func='RandomDecisionTree')
clf.fit(train_df, features=['F1', 'F2', 'F3'], label='TARGET')

# Predict & evaluate
# Pass key='<id_column>' to predict()/score() if the DataFrame has no index set
predictions = clf.predict(test_df, features=['F1', 'F2', 'F3'])
score = clf.score(test_df, features=['F1', 'F2', 'F3'], label='TARGET')
```

APL AutoML

```python
from hana_ml.algorithms.apl.classification import AutoClassifier

# Automated classification
auto_clf = AutoClassifier()
auto_clf.fit(train_df, label='TARGET')
predictions = auto_clf.predict(test_df)
```

Model Persistence

```python
from hana_ml.model_storage import ModelStorage

ms = ModelStorage(conn)
clf.name = 'MY_CLASSIFIER'
ms.save_model(model=clf, if_exists='replace')
```
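
Saving is only half of persistence. As a rough sketch, a stored model can later be fetched by name with `ModelStorage.load_model`. The helper `save_and_reload` below is hypothetical; it takes the storage object as a parameter, so it works with anything that offers the same `save_model` and `load_model` methods.

```python
def save_and_reload(storage, model, name):
    """Save `model` under `name`, then load it back from the same storage.

    `storage` is expected to behave like hana_ml.model_storage.ModelStorage.
    """
    model.name = name
    # 'replace' overwrites an existing model of the same name, as in the example above.
    storage.save_model(model=model, if_exists='replace')
    return storage.load_model(name=name)


# Usage with the objects from the Quick Start (requires a live connection):
# restored_clf = save_and_reload(ms, clf, 'MY_CLASSIFIER')
```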

Core Libraries

PAL (Predictive Analysis Library)

- 100+ algorithms executed in-database
- Categories: Classification, Regression, Clustering, Time Series, Preprocessing
- Key classes: `UnifiedClassification`, `UnifiedRegression`, `KMeans`, `ARIMA`
- See: `references/PAL_ALGORITHMS.md` for the complete list

APL (Automated Predictive Library)

- AutoML capabilities with automatic feature engineering
- Key classes: `AutoClassifier`, `AutoRegressor`, `GradientBoostingClassifier`
- See: `references/APL_ALGORITHMS.md` for details

DataFrames

- Lazy evaluation - operations only build SQL; nothing executes until `collect()` is called
- In-database processing for optimal performance
- See: `references/DATAFRAME_REFERENCE.md` for the complete API
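
To make the lazy-evaluation idea concrete, here is a toy sketch in plain Python. `LazyQuery` and `fake_executor` are invented for illustration and are not how hana-ml is implemented; they only mimic the pattern of composing SQL text and deferring execution until `collect()`.

```python
class LazyQuery:
    """Toy stand-in for a lazily evaluated DataFrame (not hana-ml code)."""

    def __init__(self, sql, executor):
        self.sql = sql
        self._executor = executor  # callable that runs SQL and returns rows

    def filter(self, condition):
        # Wrap the current statement in a subquery instead of running it.
        return LazyQuery(
            f'SELECT * FROM ({self.sql}) AS "T" WHERE {condition}', self._executor
        )

    def head(self, n):
        return LazyQuery(f'SELECT TOP {n} * FROM ({self.sql}) AS "T"', self._executor)

    def collect(self):
        # Only here does any SQL reach the (fake) database.
        return self._executor(self.sql)


executed = []  # records every statement that is actually run


def fake_executor(sql):
    executed.append(sql)
    return []


query = LazyQuery('SELECT * FROM "MY_SCHEMA"."MY_TABLE"', fake_executor)
query = query.filter('"F1" > 0').head(10)
assert executed == []   # building the query ran nothing
rows = query.collect()  # one combined statement is executed
assert len(executed) == 1
```

In hana-ml itself, the statement a DataFrame has built so far can be inspected through its `select_statement` attribute.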

Visualizers

- EDA plots, model explanations, metrics
- SHAP integration for model interpretability
- See: `references/VISUALIZERS.md` for the 14 visualization modules

Common Patterns

Train-Test Split

```python
from hana_ml.algorithms.pal.partition import train_test_val_split

train, test, val = train_test_val_split(
    data=df,
    training_percentage=0.7,
    testing_percentage=0.2,
    validation_percentage=0.1
)
```
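
A small guard can catch mistyped fractions before the split is sent to the database. This sketch assumes the three fractions are meant to cover the whole dataset, as they do in the example above; the helper `split_kwargs` is hypothetical and not part of hana-ml.

```python
import math


def split_kwargs(training, testing, validation):
    """Validate split fractions and return them as train_test_val_split keywords."""
    fractions = {
        "training_percentage": training,
        "testing_percentage": testing,
        "validation_percentage": validation,
    }
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within [0, 1], got {value}")
    total = sum(fractions.values())
    # isclose tolerates floating-point rounding, e.g. 0.7 + 0.2 + 0.1.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"fractions sum to {total}, expected 1.0")
    return fractions


# Usage (requires hana-ml and a HANA DataFrame `df`):
# train, test, val = train_test_val_split(data=df, **split_kwargs(0.7, 0.2, 0.1))
```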

Feature Importance

```python
# APL models
importance = auto_clf.get_feature_importances()

# PAL models
from hana_ml.algorithms.pal.preprocessing import FeatureSelection
fs = FeatureSelection()
fs.fit(train_df, features=features, label='TARGET')
```

Pipeline

```python
from hana_ml.algorithms.pal.pipeline import Pipeline
from hana_ml.algorithms.pal.preprocessing import Imputer, FeatureNormalizer
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification

# Each step is a (name, operator) pair; the classifier comes last
pipeline = Pipeline([
    ('imputer', Imputer(strategy='mean')),
    ('normalizer', FeatureNormalizer()),
    ('classifier', UnifiedClassification(func='RandomDecisionTree'))
])
```

Best Practices

- Use lazy evaluation - Operations build SQL without executing it until `collect()` is called
- Leverage in-database processing - Keep data in HANA for performance
- Use Unified interfaces - Consistent APIs across algorithms
- Save models - Use `ModelStorage` for persistence
- Explain predictions - Use SHAP explainers for interpretability
- Monitor AutoML - Use `PipelineProgressStatusMonitor` for long-running jobs

Bundled Resources

Reference Files

- `references/DATAFRAME_REFERENCE.md` (479 lines)
  - ConnectionContext API, DataFrame operations, SQL generation
- `references/PAL_ALGORITHMS.md` (869 lines)
  - Complete PAL algorithm reference (100+ algorithms)
  - Classification, Regression, Clustering, Time Series, Preprocessing
- `references/APL_ALGORITHMS.md` (534 lines)
  - AutoML capabilities, automated feature engineering
  - AutoClassifier, AutoRegressor, GradientBoosting classes
- `references/VISUALIZERS.md` (704 lines)
  - 14 visualization modules (EDA, SHAP, metrics, time series)
  - Plot types, configuration, export options
- `references/SUPPORTING_MODULES.md` (626 lines)
  - Model storage, spatial analytics, graph algorithms
  - Text mining, statistics, error handling

Error Handling

```python
from hana_ml.ml_exceptions import Error

try:
    clf.fit(train_df, features=features, label='TARGET')
except Error as e:
    print(f"HANA ML Error: {e}")
```
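
The same try/except pattern can be wrapped in a reusable function. The sketch below is one possible shape, not an official hana-ml utility: `fit_or_none` is a hypothetical helper, and the `ImportError` fallback exists only so the example runs on machines without hana-ml installed.

```python
try:
    from hana_ml.ml_exceptions import Error as HanaMLError
except ImportError:
    # hana-ml is not installed; fall back so the sketch still runs.
    HanaMLError = Exception


def fit_or_none(model, train_df, **fit_kwargs):
    """Fit `model`; return it on success, or None if a hana-ml error is raised."""
    try:
        model.fit(train_df, **fit_kwargs)
    except HanaMLError as exc:
        print(f"HANA ML Error: {exc}")
        return None
    return model


# Usage (requires hana-ml, a model such as `clf`, and a HANA DataFrame):
# fitted = fit_or_none(clf, train_df, features=features, label='TARGET')
```

Returning `None` keeps the caller's control flow simple; code that must not continue after a failed fit may prefer to re-raise instead.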

Documentation

Official Docs: https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html
PyPI Package: https://pypi.org/project/hana-ml/
