2. Ingest data via Source Connector
In the last exercise we onboarded CRM data to our Dataset manually via the UI uploader. In a production deployment this process will likely be automated using the batch ingestion APIs provided by Platform or one of the many Source Connectors available in Experience Platform.
Experience Platform provides a RESTful API and an interactive UI that let you set up source connections to various data providers with ease. These source connections enable you to authenticate to your storage systems and CRM services, schedule ingestion runs, and manage data ingestion throughput.
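For automated setups, the same S3 connection can be created through the Flow Service API. Below is a minimal sketch of the request payload in Python; the connection-spec ID, credential values, and dataflow name are placeholders for illustration, not values taken from this lab.

```python
import json

# Hypothetical placeholders -- substitute real values from your own
# Experience Platform organization and credentials.
FLOW_SERVICE_URL = "https://platform.adobe.io/data/foundation/flowservice"
S3_CONNECTION_SPEC_ID = "YOUR_S3_CONNECTION_SPEC_ID"  # spec ID for Amazon S3

payload = {
    "name": "Luma Demo Data Source [Labs]",
    "description": "S3 bucket holding call center CSV exports",
    "connectionSpec": {"id": S3_CONNECTION_SPEC_ID, "version": "1.0"},
    "auth": {
        "specName": "Access Key",
        "params": {
            "s3AccessKey": "YOUR_ACCESS_KEY",
            "s3SecretKey": "YOUR_SECRET_KEY",
        },
    },
}

# The payload would be POSTed to f"{FLOW_SERVICE_URL}/connections"
# with your bearer token, API key, and organization ID headers.
print(json.dumps(payload, indent=2))
```

The UI walkthrough below performs the equivalent steps interactively, so no API call is needed for this exercise.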
Let's see how we can onboard data using a source connector. In this exercise we will onboard call center data stored in our Amazon Web Services S3 bucket.
- Select Sources from under Connections in the left hand navigation.
- Click on the Catalog tab.
- You will see a list of all Source Connectors. Search for Amazon S3 and select Add Data.
We've already set up a Connection to an S3 Bucket for the purpose of this Lab.
- Select Luma Demo Data Source [Labs] from the Accounts list.
- Select Next.
Now we need to select the CSV file we want to onboard.
- Select the file [T] Retail - Call Centre Interactions from the list on the left.
- Choose Delimited for the file format from the drop down.
You will be shown a preview of the data so you can validate that Adobe Experience Platform is interpreting the data in the file correctly.
- Click Next.
We now need to specify the Dataset where this file will be ingested:
- From the Target dataset dropdown under Dataset details, select Demo System - Event Dataset for Call Center (Global v1.1).
- Under Dataflow details prefix the Dataflow name with your own name so you can find this later.
- Leave the other settings as the default.
- Click Next.
With your CSV file ready, you can proceed with mapping it to the corresponding fields in the call center XDM Schema.
As we saw in the last exercise, when we ingested data via the UI we did not need a mapping step. This was because the JSON structure in that file was already compliant with the XDM schema. In this example, however, the CSV file has a different structure that does not conform to the XDM Schema, so we must map the column headers in the CSV to the fields in the XDM Schema before ingestion.
The goal of this exercise is to onboard call center data into Platform. Most of the data that is ingested into Platform should be mapped against a specific XDM Schema. What you currently have is a CSV dataset on one side, and a dataset that is linked to a schema on the other side. To load that CSV file into that dataset, a mapping needs to take place. To facilitate this mapping exercise, we have Workflows available in Adobe Experience Platform.
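Conceptually, the mapping step rewrites each flat CSV row into the nested shape the XDM schema expects. A minimal sketch of that transformation, using made-up column names and XDM paths (the real paths come from your call center schema):

```python
import csv
import io

# Hypothetical mapping: CSV column header -> dot-separated XDM field path.
# The actual paths are defined by the call center XDM schema, not these.
MAPPING = {
    "callId": "_demo.callCenter.callId",
    "agentName": "_demo.callCenter.agent.name",
    "timestamp": "timestamp",
}

def map_row(row: dict, mapping: dict) -> dict:
    """Turn one flat CSV row into a nested XDM-style record."""
    record = {}
    for column, xdm_path in mapping.items():
        parts = xdm_path.split(".")
        node = record
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = row[column]
    return record

sample = io.StringIO(
    "callId,agentName,timestamp\nC-001,Alice,2020-01-01T10:00:00Z\n"
)
rows = [map_row(r, MAPPING) for r in csv.DictReader(sample)]
print(rows[0])
```

In the Workflow, Platform performs this translation for you once the column-to-field mappings are defined.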
You should now be on the mapping step of the Workflow to onboard the data from S3.
You now need to map your CSV Column Headers with an XDM-property in your Dataset.
NOTE: Adobe Experience Platform proposes schema mappings by linking fields together automatically. If any of the proposed mappings are incorrect, you can adjust the individual mappings in the UI.
To edit the mappings:
- Select the lightbulb icon on the Target field you want to change.
- If your chosen field doesn't appear in the list, choose Select manually.
- Select your chosen field from the schema structure.
- Your source attribute is now mapped to your chosen Target field.
- You can also type your Target field directly in the text box if you know the path.
For this exercise you will not need to edit any of the mappings. Instead, you will import a mapping from a previous ingestion.
- Select Import mapping from the menu above the mappings near the center of the screen.
- This will open a window with mappings from previous Dataflows.
- Select the Dataflow Call Center Dataflow v1.
- Click Select.
Your final mapping should match the mappings imported from Call Center Dataflow v1.
- Click Next, you'll then see the Scheduling options available for this Source Connector.
- Keep the default settings of Once, and click Next.
- On the final Review step, click Finish to onboard your data.
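In the Flow Service API, the scheduling choice you made above corresponds to the dataflow's scheduleParams. A hedged sketch of a one-time schedule payload (the connection and flow-spec IDs are placeholders, not values from this lab):

```python
import time

# Placeholder IDs -- in a real call these come from the source/target
# connections and flow spec created earlier in the workflow.
dataflow = {
    "name": "yourname - Call Center Dataflow",
    "flowSpec": {"id": "YOUR_FLOW_SPEC_ID", "version": "1.0"},
    "sourceConnectionIds": ["YOUR_SOURCE_CONNECTION_ID"],
    "targetConnectionIds": ["YOUR_TARGET_CONNECTION_ID"],
    "scheduleParams": {
        "startTime": str(int(time.time())),  # run as soon as possible
        "frequency": "once",                 # matches the "Once" UI option
    },
}
print(dataflow["scheduleParams"])
```

A recurring dataflow would instead use an interval-based frequency; the "Once" option used in this lab runs a single ingestion.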
You'll then be directed to the Sources overview, where you will see the list of the Dataflows configured for this source.
- Find yours and select the Target Dataset to view the Dataset Activity.
- You'll then see the Dataset Overview, where your ingestion is being processed; after a couple of minutes, refresh your screen to see whether your Dataflow completed successfully.
On the Dataset activity screen, you'll see a Batch ID that has been ingested just now, with 30 records ingested and a status of Success.
- Click on the Preview Dataset button to get a quick view of a small sample of the data ingested to ensure that it loaded correctly.
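The same success check can be done programmatically against the Catalog batches for the dataset. The sketch below simply filters a list of batch records shaped loosely like hypothetical Catalog responses; field names here are illustrative, not the exact API schema.

```python
# Hypothetical batch records, shaped loosely like Catalog API responses.
batches = {
    "batch-123": {"status": "success", "metrics": {"recordsIngested": 30}},
    "batch-456": {"status": "processing", "metrics": {}},
}

def successful_batches(batches: dict) -> list:
    """Return (batch_id, records) pairs for batches that finished successfully."""
    return [
        (batch_id, info["metrics"].get("recordsIngested", 0))
        for batch_id, info in batches.items()
        if info["status"] == "success"
    ]

print(successful_batches(batches))
```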
Once data is loaded, you can define the correct data governance approach for your dataset. We will do this in a later exercise.
With this, you’ve now successfully ingested call center data into Adobe Experience Platform via a Source Connector!