Technical article: Queries in the Adobe Experience Platform (AEP)
Artur Rutkiewicz, Digital Analytics & Targeting Consultant at diva-e, originally comes from data analytics and onsite personalization & testing. Due to the close professional relationship, the familiarization with the new Adobe Experience Platform (AEP), as Adobe's CDP solution, was almost fated. I want to share my experiences with you.
In mid-2020, I started learning the new Adobe Experience Platform (AEP). diva-e was the first Regional Partner ever to be provided with a sandbox, which I could familiarize myself with right away. From the numerous training and documentation materials, Adobe kept referring me to their architecture diagram, which helped me with many more questions:
Source: Adobe Experience League
Query service in the AEP
From the beginning, I was interested in query service in the AEP, and I went to the limit of this feature. For those unfamiliar with the quality, Adobe uses the term queries to describe an integrated interface that can be used to query AEP datasets via SQL statements. Usually, standard SQL is used, but a few Adobe-specific comments are described in the constantly expanding documentation.
The nice thing about these queries is that they are executed directly in the tool, i.e. close to the data source, and thus no export of the data or statements is necessary. Of course, a CDP quickly raises the question of which data the query service accesses - after all, the purpose of a CDP is to stitch different data sources using standard identifiers and thus to form customer profiles. Adobe's architecture diagram comes into play again: The query service queries data at the data level and cannot access the (virtually) created real-time customer profile.
If one wants to have this, a reconstruction via the known identifiers is necessary. Sounds complicated? It all depends on the number of datasets and can become challenging very quickly. In my opinion, the Datascience environment of the AEP (Jupyter notebooks) is more suitable here. But I guess that's a matter of taste since both solutions access the same data sources.
The purpose of the query service
But back to the actual purpose of the query service. At diva-e, we were fortunate to get our first customer excited about the AEP very quickly after learning about the sandbox. In the course of the project, it turned out to be helpful again and again to query a dataset precisely and rapidly, for example, to perform a QA during the implementation of the Web SDK or to support the design of a segment. Fast is relative: the query service is built on a Postgres infrastructure, making queries a bit sluggish. Many a time, I've run into the query timeout of 10 minutes. Did I say Postgres? That's right. Adobe offers the ability to connect the query service to its own PostgreSQL infrastructure. We tried this with Tableau's built-in connector, and in RStudio, for example, and voila - we can already query, save, visualize, and more data outside of AEP. Read more here.
Creating a query schema
Was that all? Of course, just querying data is all well and good, but only valid for ad hoc questions. Adobe does offer the ability to store queries in a dataset in the UI. However, this creates a so-called "ad-hoc" dataset based on an automatically generated schema (based on Adobe XDM) and cannot be activated for profiles. This is a pity and often not helpful, but of course, it can be worked around. First of all, what is needed is an already created schema (activated or at least activatable for profiles). With
CREATE TABLE <dataset_name> WITH (schema='<schema_name>', rowvalidation='true')
and the corresponding statement, a new dataset is created with the currently queried data on an existing schema. If one would like to fill an existing dataset the
INSERT INTO <dataset_name>
command helps. However, it took me some time to find the right command for filling objects, which frequently occur in AEP schemas. With
STRUCT(<statement>) AS <object_name>
I finally found the solution. A standard dataset is now filled with the results of a query and can be activated for profiles. And if you want to schedule the whole thing? Then a look into the API helps because questions can be designed via it. More about that here.
You have to deal with the "cron expression format" a bit - this can be tricky initially, but like with regex, you will get there in the end. It is also important to note that within the query statement in the API payload, you should not use quotes but replace them with a single apostrophe (').
Queries in the Adobe Experience Platform are a powerful tool to quickly get the information in the datasets on the one hand. In addition, they offer the possibility to use the data available in the AEP for visualizations and even calculations. Especially the last point shows how open the platform is designed and provides a comprehensive toolset to achieve its goal.
On-Demand Webinar: Adobe Experience Platform (AEP) & Informatica Data Integration Cloud
To ensure high data quality in this process, an ETL tool can be helpful. Informatica Intelligent Cloud Services' Data Integration Services provides a toolset aligned with AEP, which I present in my on-demand webinar. Watch it here anytime for free.
diva-e Adobe expertise from the leading digital partner
We're also constantly expanding our web presence around Adobe Experience Cloud and our long-standing partnership with the latest information, services, and helpful tips, success stories, and webinars for our customers.
Want to learn more about our complete services, expertise, and projects around Adobe? Feel free to check out our Adobe Library for suitable webinars and learn about everything else.