Updating Custom Metadata for NoSQL Sources
Deprecation Notice
The Upload Logical Metadata (ULM) API will be deprecated in the 2024.3 release of Alation in early September. We recommend you move as soon as possible to the alternatives listed in the API Release Notes.
Overview
NoSQL objects are becoming more and more popular thus increasing the need for governance and documenting curating this data. In this guide, we will be going through how to update custom metadata fields via API.
Environment Setup
We will be using the below NoSQL example with the following custom fields enabled on the templates NoSQL Database
, NoSQL Collection
, NoSQL Attribute
:
- Line of Business (Multi-Picker)
- Finance
- Operations
- Marketing
- Research
- Notes (Rich Text Field)
- Contains PII (Picker)
- Yes
- No
- Review
Permissions Needed
Please consult APIs by Roles for the latest information.
Additional Information
Please see ULM: Key Formatting and ULM: Custom Field Formatting for additional details before starting this guide.
VS for NoSQL Databases is another great source for creating a NoSQL db in Alation.
NoSQL Structure
When reviewing the ULM: Key Formatting we can see there are 3 major object types,
- docstore_folder
- docstore_collection
- doc_schema
In the below image, we can see how this breaks down.
Updating Docstore Folder
Here's a quick example of what the docstore folder Products
looks like in Alation.
Let's now use the Upload Logical Metadata to update the custom fields here.
import requests
import json
alation_token = "CHANGEME"
base_url = "CHANGEME" # no trailing slash here
# note this API does not want a trailing slash, "/", after the object type.
# Otherwise, it will not work as expected
api = "/api/v1/bulk_metadata/custom_fields/default/{}"
object_type = "docstore_folder"
# replace_values = true will allow us to update the values if there is one present
# if set to false, then it will only update the custom fields if the values are not
# previously set. create_new=false, although this won't impact this otype
params = "?replace_values=true&create_new=false"
# here we are adding the object type variable to the url dynamically
url = base_url + api.format(object_type) + params
# note that we need to dump this to one line with no "\n", aka new line, in the payload
payload = json.dumps({
"key": "281.Products",
"Contains PII": "Review",
"Line of Business": [
"Finance",
"Operations"
],
"Notes": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer ac turpis id metus semper tempus. Mauris hendrerit ex eu sagittis varius. Proin placerat massa elit, vitae posuere eros efficitur a. Curabitur in massa congue, dapibus leo vel, porta enim. Maecenas vitae laoreet diam. Nulla finibus varius mi in posuere. Vestibulum sollicitudin, odio fringilla consectetur porttitor, lectus orci semper tortor, vel luctus felis elit vel lorem. Cras eleifend sagittis urna id dignissim. Duis sed congue magna. Donec pharetra justo quis enim gravida, et fermentum nisl pharetra. Fusce sed porttitor sem. Aliquam aliquam a libero et varius. "
})
headers = {
'token': 'Ro37-ngigc53-5oRKgJE3bqYuAgw4i9vqiIK3EXel6k',
}
response = requests.post(url, headers=headers, data=payload)
print(response.json())
"""
response from Alation should look like...
{
"number_received": 1,
"updated_objects": 1,
"new_objects": 0,
"error_objects": [],
"error": ""
}
"""
Trust but verify. Let's look at this object in Alation now.
Updating Docstore Collection
# this code is a continuation of the previous api call.
# changing the object type
object_type = "docstore_collection"
# here we are adding the object type variable to the url dynamically
url = base_url + api.format(object_type) + params
# notice that here we added "Products" to the key with the "." notation
# we also updated the type
payload = json.dumps({
"key": "281.Products.Products",
"Contains PII": "Yes",
"Line of Business": [
"Research"
],
"Notes": "You're awesome"
})
headers = {
'token': 'Ro37-ngigc53-5oRKgJE3bqYuAgw4i9vqiIK3EXel6k',
'Content-Type': 'application/json',
'Cookie': 'sessionid=pt0p1gtp3ektcy6zuu1ki52eghvhcngi'
}
response = requests.post(url, headers=headers, data=payload)
print(response.json())
In Alation, this will look like...
Updating Docstore Schemas w/ Nested Objects
JSON objects can have nested objects within themselves. It is a simple solve as we change the object_type to docstore_schema and then keep adding a "." to each field name. e.g. we can see 281.Products.Products.Collection Item
has priceDate
object with $date
nested inside. This would make the key 281.Products.Products.Collection Item.priceDate.$date
.
NOTE: If there is a "." in the name, you can always escape that character with a "". Here is what this page looks like before the POST.
The ULM API also allows you to update the title
and description
of each object.
# this code is a continuation of the previous api call.
# changing the object type
object_type = "doc_schema"
# here we are adding the object type variable to the url dynamically
url = base_url + api.format(object_type) + params
# note that when POSTing multiple rows of data that this api expects
# JSON objects to be delimeted by "\n". These objects are NOT in an array/list, "[]"
# You can use JSON dumps, as seen above, to format this string
payload = "{\"key\": \"281.Products.Products.Collection Item.color\", \"Contains PII\": \"No\",\"title\": \"Color\" ,\"Description\": \"Vivamus suscipit id mauris sed pellentesque. Integer nec est non nulla feugiat eleifend non et lacus. Morbi vel erat mauris. In quis urna magna.\",\"Notes\":\"You're awesome\"}\n{\"key\": \"281.Products.Products.Collection Item._id\", \"Contains PII\": \"No\", \"title\":\"ID\", \"Notes\": \"You're awesome\"}\n{\"key\": \"281.Products.Products.Collection Item.priceDate\", \"title\":\"Price Date\", \"Description\": \"<b>Phasellus nec dignissim</b> lorem, ut sodales massa. Etiam tempus lobortis elementum. Sed mollis mauris odio, vitae hendrerit eros tristique pulvinar.\"}\n{\"key\": \"281.Products.Products.Collection Item.priceDate.$date\", \"Contains PII\": \"No\", \"title\":\"Date\", \"Description\": \"Example of embedded object\"}"
# here is the json payload broken out. Note how we handle the priceDate
# and priceDate.$date. We just added a period between to point this api to the
# correct object.
"""
{"key": "281.Products.Products.Collection Item.color", "Contains PII": "No","title": "Color" ,"Description": "Vivamus suscipit id mauris sed pellentesque. Integer nec est non nulla feugiat eleifend non et lacus. Morbi vel erat mauris. In quis urna magna.","Notes":"You're awesome"}
{"key": "281.Products.Products.Collection Item._id", "Contains PII": "No", "title":"ID", "Notes": "You're awesome"}
{"key": "281.Products.Products.Collection Item.priceDate", "title":"Price Date", "Description": "<b>Phasellus nec dignissim</b> lorem, ut sodales massa. Etiam tempus lobortis elementum. Sed mollis mauris odio, vitae hendrerit eros tristique pulvinar."}
{"key": "281.Products.Products.Collection Item.priceDate.$date", "Contains PII": "No", "title":"Date", "Description": "Example of embedded object"}
"""
response = requests.post(url, headers=headers, data=payload)
print(response.json())
"""
Output would look like
{
"number_received": 4,
"updated_objects": 4,
"new_objects": 0,
"error_objects": [],
"error": ""
}
"""
Here is the updated docstore_schema page.
NOTE: We did not update the
Collection Item
docstore schema custom fields.
Updated 7 months ago