SAP Datasphere Integration: Using ABAP Cloud Data for Analytics

Category
Integration
Published
Author
Johannes

SAP Datasphere is the successor to SAP Data Warehouse Cloud and forms the core of the SAP Business Data Fabric. For ABAP Cloud developers, it opens up the possibility of making transactional data available for complex analytical scenarios.

SAP Datasphere Fundamentals

SAP Datasphere combines data warehousing, data lake, data virtualization, and business semantics in one integrated platform. The architecture is based on Spaces, which act as isolated data zones.

Key Concepts

Concept | Description
Space | Isolated workspace for teams
Data Builder | Modeling of views and tables
Business Builder | Semantic layer and KPIs
Data Access Controls | Data-level authorizations
Connections | Connections to source systems

Integration with ABAP Cloud

ABAP Cloud systems can be connected in several ways:

  1. ABAP CDS Extraction: direct extraction of CDS views
  2. OData Services: access via OData interfaces
  3. Remote Tables: virtual access to SAP systems
  4. Replication Flows: scheduled data replication
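Option 2 (OData) can be sketched with a CDS service definition that exposes an extraction view. The service name and alias below are illustrative; the definition still needs an OData service binding created in ADT before Datasphere can consume it:

```abap
@EndUserText.label: 'Sales orders for analytics consumption'
define service ZAPI_SALES_ANALYTICS {
  expose ZI_SalesOrderExtract as SalesOrder;
}
```

The URL generated by the service binding is what the OData connection in Datasphere points to.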

Extracting CDS Views for Datasphere

CDS view extraction is the recommended approach from ABAP Cloud to Datasphere. The views must carry specific annotations.

Extraction-Enabled CDS Views

@Analytics.dataCategory: #CUBE
@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view entity ZI_SalesOrderExtract
  as select from zsalesorder as so
  association [1..1] to ZI_CustomerMaster as _Customer
    on $projection.CustomerId = _Customer.CustomerId
{
  key so.sales_order_id as SalesOrderId,
      so.customer_id    as CustomerId,
      so.order_date     as OrderDate,
      @Semantics.amount.currencyCode: 'Currency'
      so.net_amount     as NetAmount,
      so.currency       as Currency,
      so.status         as Status,
      @Semantics.systemDateTime.lastChangedAt: true
      so.changed_on     as ChangedOn,
      @Semantics.systemDateTime.createdAt: true
      so.created_on     as CreatedOn,
      _Customer
}

Key Annotations for Extraction

Annotation | Description
@Analytics.dataExtraction.enabled | Enables extraction
@Analytics.dataExtraction.delta.changeDataCapture.automatic | Automatic CDC
@Analytics.dataExtraction.delta.byElement.name | Field used for delta detection
@Analytics.dataCategory | Categorization (#CUBE, #DIMENSION, #FACT)

Analytical CDS Views for Datasphere

For optimal analytics, CDS views should be modeled as a star schema.

Dimension View (Customer)

@AbapCatalog.viewEnhancementCategory: [#NONE]
@Analytics.dataCategory: #DIMENSION
@Analytics.dataExtraction.enabled: true
@ObjectModel.representativeKey: 'CustomerId'
define view entity ZI_CustomerDimension
  as select from zcustomer
{
  key customer_id   as CustomerId,
      @Semantics.text: true
      customer_name as CustomerName,
      country       as Country,
      region        as Region,
      city          as City,
      customer_type as CustomerType,
      @Semantics.systemDate.createdAt: true
      created_on    as CreatedOn
}

Fact View (Revenue)

@AbapCatalog.viewEnhancementCategory: [#NONE]
@Analytics.dataCategory: #FACT
@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view entity ZI_SalesFactExtract
  as select from zsales_data as sd
  association [1..1] to ZI_CustomerDimension as _Customer
    on $projection.CustomerId = _Customer.CustomerId
  association [1..1] to ZI_ProductDimension as _Product
    on $projection.ProductId = _Product.ProductId
  association [1..1] to ZI_TimeDimension as _Time
    on $projection.SalesDate = _Time.CalendarDate
{
  key sd.sales_id    as SalesId,
      sd.customer_id as CustomerId,
      sd.product_id  as ProductId,
      sd.sales_date  as SalesDate,
      @Semantics.amount.currencyCode: 'Currency'
      @DefaultAggregation: #SUM
      sd.revenue     as Revenue,
      @DefaultAggregation: #SUM
      sd.quantity    as Quantity,
      @DefaultAggregation: #SUM
      sd.discount    as Discount,
      sd.currency    as Currency,
      @Semantics.systemDateTime.lastChangedAt: true
      sd.changed_at  as ChangedAt,
      _Customer,
      _Product,
      _Time
}

Cube View (Aggregation)

@AbapCatalog.viewEnhancementCategory: [#NONE]
@Analytics.dataCategory: #CUBE
@Analytics.dataExtraction.enabled: true
define view entity ZI_SalesAnalyticsCube
  as select from ZI_SalesFactExtract as Fact
  association [1..1] to ZI_CustomerDimension as _Customer
    on $projection.CustomerId = _Customer.CustomerId
  association [1..1] to ZI_ProductDimension as _Product
    on $projection.ProductId = _Product.ProductId
{
  key Fact.SalesId,
      -- Dimensions
      @ObjectModel.foreignKey.association: '_Customer'
      Fact.CustomerId,
      @ObjectModel.foreignKey.association: '_Product'
      Fact.ProductId,
      @AnalyticsDetails.query.axis: #ROWS
      Fact.SalesDate,
      -- Measures
      @AnalyticsDetails.query.axis: #COLUMNS
      @Aggregation.default: #SUM
      Fact.Revenue,
      @Aggregation.default: #SUM
      Fact.Quantity,
      @Aggregation.default: #AVG
      Fact.Discount,
      Fact.Currency,
      -- Associations
      _Customer,
      _Product
}

Configuring Delta Extraction

Delta extraction transfers only the data changed since the initial load. This saves resources and reduces latency.

Timestamp-Based Delta Extraction

@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.byElement: {
  name: 'ChangedAt',
  maxDelayInSeconds: 3600,
  ignoreDeletions: false
}
define view entity ZI_MaterialWithDelta
  as select from zmaterial
{
  key material_id   as MaterialId,
      material_name as MaterialName,
      base_unit     as BaseUnit,
      material_type as MaterialType,
      @Semantics.systemDateTime.lastChangedAt: true
      changed_at    as ChangedAt,
      deletion_flag as DeletionFlag
}

Change Data Capture (CDC)

For near-real-time replication, automatic CDC can be enabled:

@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.changeDataCapture: {
  automatic: true,
  rowStatus: 'DELTA_STATUS'
}
define view entity ZI_OrderWithCDC
  as select from zorder
  association [0..*] to ZI_OrderItemWithCDC as _Items
    on $projection.OrderId = _Items.OrderId
{
  key order_id     as OrderId,
      customer_id  as CustomerId,
      order_date   as OrderDate,
      order_status as OrderStatus,
      @Semantics.systemDateTime.lastChangedAt: true
      changed_at   as ChangedAt,
      -- CDC status field
      delta_status as DeltaStatus,
      _Items
}

CDC Field Values

Value | Meaning
' ' (blank) | Unchanged
'U' | Update
'I' | Insert
'D' | Delete
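When replicated rows carrying this status field are post-processed in custom ABAP logic, the four values map naturally onto a CASE distinction. A minimal sketch, assuming an internal table lt_changed_orders with a deltastatus component and a hypothetical target table ztarget_order whose line type matches:

```abap
" Apply CDC row statuses to a target table (illustrative names)
LOOP AT lt_changed_orders INTO DATA(ls_order).
  CASE ls_order-deltastatus.
    WHEN 'I'.    " new row: insert into the target
      INSERT ztarget_order FROM @ls_order.
    WHEN 'U'.    " changed row: overwrite the target row
      MODIFY ztarget_order FROM @ls_order.
    WHEN 'D'.    " deleted row: remove it from the target
      DELETE ztarget_order FROM @ls_order.
    WHEN OTHERS. " ' ' = unchanged, nothing to do
  ENDCASE.
ENDLOOP.
```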

Complex Delta Scenario

@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view entity ZI_InvoiceExtract
  as select from zinvoice as inv
    inner join zcustomer as cust
      on inv.customer_id = cust.customer_id
{
  key inv.invoice_id     as InvoiceId,
      inv.customer_id    as CustomerId,
      cust.customer_name as CustomerName,
      inv.invoice_date   as InvoiceDate,
      inv.total_amount   as TotalAmount,
      inv.currency       as Currency,
      inv.payment_status as PaymentStatus,
      -- Delta fields
      @Semantics.systemDateTime.lastChangedAt: true
      case
        when inv.changed_at > cust.changed_at
        then inv.changed_at
        else cust.changed_at
      end as ChangedAt,
      -- Computed deletion indicator
      case
        when inv.deletion_flag = 'X' or cust.deletion_flag = 'X'
        then 'X'
        else ' '
      end as DeletionIndicator
}

Authorizations for Extraction

Extraction requires dedicated authorizations both in the ABAP system and in Datasphere.

ABAP-Side Authorizations

CLASS zcl_datasphere_auth DEFINITION
  PUBLIC
  CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_badi_interface.
    METHODS check_extraction_auth
      IMPORTING
        iv_view_name         TYPE sxco_cds_object_name
      RETURNING
        VALUE(rv_authorized) TYPE abap_bool.
ENDCLASS.

CLASS zcl_datasphere_auth IMPLEMENTATION.
  METHOD check_extraction_auth.
    " Check the extraction authorization via the custom authorization object
    AUTHORITY-CHECK OBJECT 'Z_DS_EXTR'
      ID 'Z_VIEWNAME' FIELD iv_view_name
      ID 'ACTVT'      FIELD '03'. " Display
    rv_authorized = xsdbool( sy-subrc = 0 ).
  ENDMETHOD.
ENDCLASS.

Authorization Object Definition

" Authorization object Z_DS_EXTR for Datasphere extraction
" Fields:
" - Z_VIEWNAME: name of the CDS view (CHAR 40)
" - ACTVT: activity (01 = Create, 02 = Change, 03 = Display)
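A caller could use the class above to guard an extraction request. A sketch; zcx_extraction_denied is a hypothetical exception class introduced only for this example:

```abap
" Reject unauthorized extraction requests (sketch)
DATA(lo_auth) = NEW zcl_datasphere_auth( ).
IF lo_auth->check_extraction_auth( 'ZI_SALESORDEREXTRACT' ) = abap_false.
  RAISE EXCEPTION TYPE zcx_extraction_denied.
ENDIF.
```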

CDS Views with Access Control

@AccessControl.authorizationCheck: #CHECK
@Analytics.dataExtraction.enabled: true
define view entity ZI_ConfidentialSalesData
  as select from zsales_data
{
  key sales_id      as SalesId,
      customer_id   as CustomerId,
      region        as Region,
      revenue       as Revenue,
      profit_margin as ProfitMargin,
      currency      as Currency,
      changed_at    as ChangedAt
}

Associated DCL (Data Control Language)

@EndUserText.label: 'Sales data authorization'
@MappingRole: true
define role ZI_ConfidentialSalesData {
  grant select on ZI_ConfidentialSalesData
    where ( Region ) = aspect pfcg_auth(
      Z_SD_REGN,
      Z_REGION,
      ACTVT = '03'
    );
}

Technical User for Extraction

Automated extraction requires a technical user:

CLASS zcl_datasphere_extraction_user DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC.
  PUBLIC SECTION.
    CLASS-METHODS get_technical_user
      RETURNING
        VALUE(rv_user) TYPE syuname.
    CLASS-METHODS validate_technical_user
      IMPORTING
        iv_user         TYPE syuname
      RETURNING
        VALUE(rv_valid) TYPE abap_bool.
ENDCLASS.

CLASS zcl_datasphere_extraction_user IMPLEMENTATION.
  METHOD get_technical_user.
    " Technical user for Datasphere extraction
    rv_user = 'DATASPHERE_EXTRACT'.
  ENDMETHOD.

  METHOD validate_technical_user.
    " Check whether the user exists and is not locked
    SELECT SINGLE bname FROM usr02
      WHERE bname = @iv_user
        AND uflag = 0 " not locked
      INTO @DATA(lv_user).
    rv_valid = xsdbool( sy-subrc = 0 ).
  ENDMETHOD.
ENDCLASS.

Live Data Connection vs. Replication

SAP Datasphere offers two fundamental integration modes that cover different use cases.

Live Data Connection (Virtualization)

With a Live Data Connection, data is not copied but read directly from the source system at query time.

Advantages:

  • Data is always up to date
  • No additional storage consumption
  • No replication logic required

Disadvantages:

  • Higher query latency
  • Load on the source system
  • Limited transformation options

Use cases:

  • Master data with a low change rate
  • Ad-hoc queries
  • Prototyping and exploration

Replication (Materialization)

With replication, data is physically copied to Datasphere and persisted there.

Advantages:

  • High query performance
  • No load on the source system
  • Complex transformations possible
  • Historization possible

Disadvantages:

  • Data is not real-time
  • Additional storage consumption
  • Replication logic required

Use cases:

  • High-volume transactional data
  • Complex analytical queries
  • Reporting with defined refresh cycles

Decision Matrix

Criterion | Live Data | Replication
Data freshness | Real-time | Deferred (minutes to hours)
Performance | Medium | High
Source system load | High | Low
Data volume | Small to medium | Unlimited
Transformations | Limited | Extensive

Hybrid Scenario

-- Master data: intended for live access (the Remote vs. Replicated data
-- access mode is configured on the remote table in Datasphere)
@Analytics.dataExtraction.enabled: true
define view entity ZI_CustomerMasterLive
  as select from zcustomer
{
  key customer_id   as CustomerId,
      customer_name as CustomerName,
      customer_type as CustomerType,
      credit_limit  as CreditLimit,
      currency      as Currency
}

-- Transactional data: intended for replication (high performance)
@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.changeDataCapture.automatic: true
define view entity ZI_SalesHistoryReplicated
  as select from zsales_history
{
  key sales_id    as SalesId,
      customer_id as CustomerId,
      sales_date  as SalesDate,
      amount      as Amount,
      currency    as Currency,
      changed_at  as ChangedAt
}

Configuration in Datasphere

Connection Setup

In Datasphere, the connection to the ABAP system is configured with:

  1. Connection Type: SAP ABAP
  2. System ID: system identifier (e.g. S4P)
  3. Host: host name of the ABAP system
  4. Client: client number
  5. User/Password: technical user

Remote Table Configuration

Remote table parameters:
- Source Object: ZI_SalesFactExtract
- Load Type: Initial + Delta
- Delta Frequency: every 15 minutes
- Memory: Large
- Partitioning: by SalesDate (monthly)

Data Flow Definition

For complex transformations, Data Flows can be defined:

Source: ZI_SalesFactExtract (remote table)
→ Filter: SalesDate >= '20240101'
→ Aggregation: SUM(Revenue) GROUP BY CustomerId, Month
→ Enrichment: LEFT JOIN CustomerMaster
→ Target: Sales_Monthly_Summary (local table)
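When the transformation is better kept on the ABAP side, the same filter/aggregate/join pipeline can be pre-computed in a CDS view instead of a Data Flow. A sketch reusing the views defined earlier; the view name and the YYYYMM month derivation via left() are illustrative assumptions:

```abap
define view entity ZI_SalesMonthlySummary
  as select from ZI_SalesFactExtract as Fact
    left outer join ZI_CustomerDimension as Cust
      on Fact.CustomerId = Cust.CustomerId
{
      -- Grouping keys: customer and calendar month (YYYYMM)
  key Fact.CustomerId,
  key left( Fact.SalesDate, 6 ) as SalesMonth,
      Cust.CustomerName,
      @Semantics.amount.currencyCode: 'Currency'
      sum( Fact.Revenue )       as Revenue,
      Fact.Currency
}
where
  Fact.SalesDate >= '20240101'
group by
  Fact.CustomerId,
  left( Fact.SalesDate, 6 ),
  Cust.CustomerName,
  Fact.Currency
```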

Performance Optimization

Designing CDS Views for Extraction

-- Optimized for extraction
@Analytics.dataExtraction.enabled: true
@Analytics.dataExtraction.delta.byElement.name: 'ChangedAt'
define view entity ZI_SalesOptimized
  as select from zsales
{
  key sales_id    as SalesId,
      -- Partitioning key first
      sales_date  as SalesDate,
      -- Filterable, indexed fields
      region      as Region,
      customer_id as CustomerId,
      amount      as Amount,
      currency    as Currency,
      @Semantics.systemDateTime.lastChangedAt: true
      changed_at  as ChangedAt
}

Batch Sizes and Parallelization

CLASS zcl_extraction_config DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC.
  PUBLIC SECTION.
    TYPES:
      BEGIN OF ts_params,
        batch_size      TYPE i,
        parallel_jobs   TYPE i,
        delta_interval  TYPE i,
        use_compression TYPE abap_bool,
      END OF ts_params.
    CONSTANTS:
      c_batch_size        TYPE i VALUE 50000,
      c_max_parallel_jobs TYPE i VALUE 4,
      c_delta_interval    TYPE i VALUE 900. " 15 minutes
    CLASS-METHODS get_extraction_params
      IMPORTING
        iv_view_name     TYPE sxco_cds_object_name
      RETURNING
        VALUE(rs_params) TYPE ts_params.
ENDCLASS.

CLASS zcl_extraction_config IMPLEMENTATION.
  METHOD get_extraction_params.
    " View-specific configuration
    rs_params = VALUE #(
      batch_size      = c_batch_size
      parallel_jobs   = c_max_parallel_jobs
      delta_interval  = c_delta_interval
      use_compression = abap_true
    ).
    " Adjustment for large views
    IF iv_view_name CS 'HISTORY' OR iv_view_name CS 'ARCHIVE'.
      rs_params-batch_size    = 100000.
      rs_params-parallel_jobs = 8.
    ENDIF.
  ENDMETHOD.
ENDCLASS.

Monitoring and Troubleshooting

Monitoring Extraction

CLASS zcl_extraction_monitor DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC.
  PUBLIC SECTION.
    TYPES:
      BEGIN OF ts_status,
        view_name     TYPE sxco_cds_object_name,
        last_run      TYPE timestampl,
        records_total TYPE i,
        records_delta TYPE i,
        runtime_ms    TYPE i,
        status        TYPE c LENGTH 10,
        error_message TYPE string,
      END OF ts_status.
    METHODS get_extraction_status
      IMPORTING
        iv_view_name     TYPE sxco_cds_object_name
      RETURNING
        VALUE(rs_status) TYPE ts_status.
ENDCLASS.

CLASS zcl_extraction_monitor IMPLEMENTATION.
  METHOD get_extraction_status.
    " Example status query: read the latest log entry for the view
    " (zextraction_log is a custom log table)
    SELECT FROM zextraction_log
      FIELDS *
      WHERE view_name = @iv_view_name
      ORDER BY timestamp DESCENDING
      INTO TABLE @DATA(lt_log)
      UP TO 1 ROWS.
    IF lt_log IS NOT INITIAL.
      DATA(ls_log) = lt_log[ 1 ].
      rs_status = VALUE #(
        view_name     = ls_log-view_name
        last_run      = ls_log-timestamp
        records_total = ls_log-records_total
        records_delta = ls_log-records_delta
        runtime_ms    = ls_log-runtime_ms
        status        = ls_log-status
        error_message = ls_log-error_message
      ).
    ENDIF.
  ENDMETHOD.
ENDCLASS.

Common Errors

Error | Cause | Solution
No extraction enabled | Missing annotation | Add @Analytics.dataExtraction.enabled: true
Delta field not found | Wrong field name | Check the field name in byElement.name
Authorization failed | Missing authorization | Grant the technical user the required authorizations
Timeout during extraction | Data volume too large | Reduce the batch size

Best Practices

  1. Model a star schema: keep fact and dimension views clearly separated
  2. Always enable delta: reduces load and transfer volume
  3. Define authorizations at a granular level: expose only the data that is needed
  4. Test performance: measure extraction times before go-live
  5. Set up monitoring: detect failed extractions early
  6. Maintain documentation: document the views and how they are used

Summary

Aspect | Recommendation
Master data | Live Data Connection
Transactional data | Replication with delta
Authorizations | Combine DCL + PFCG
Performance | Partitioning + indexes
Monitoring | Regular status checks

Integrating ABAP Cloud with SAP Datasphere opens up powerful analytical possibilities. With the right annotations, delta strategies, and authorization concepts, a robust and performant data supply can be built.
