Salesforce

Module salesforce

Incubating

Important Capabilities

Capability | Status | Notes
Data Profiling | | Only table level profiling is supported via profiling.enabled config field
Detect Deleted Entities | | Not supported yet
Domains | | Supported via the domain config field
Platform Instance | | Can be equivalent to Salesforce organization

Prerequisites

In order to ingest metadata from Salesforce, you will need:

  • A Salesforce username, password, and security token, OR
  • A Salesforce instance URL and an access token/session ID (suitable for one-shot ingestion only, because the access token typically expires after 2 hours of inactivity)

The account used to access Salesforce requires the following permissions for this integration to work:

  • View Setup and Configuration
  • View All Data
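
For the second option listed above (instance URL plus access token), a recipe might look roughly like the following. This is a minimal sketch rather than a verified configuration: the field names auth, instance_url, and access_token come from the configuration table on this page, while the URL and the environment variable name SALESFORCE_ACCESS_TOKEN are placeholders chosen for illustration.

```yaml
source:
  type: "salesforce"
  config:
    # Use a pre-obtained access token instead of username/password/security token
    auth: DIRECT_ACCESS_TOKEN
    # Placeholder instance URL; replace with your own My Domain URL
    instance_url: "https://mydomain.my.salesforce.com/"
    # Hypothetical environment variable holding the token or session id
    access_token: "${SALESFORCE_ACCESS_TOKEN}"

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
```

Because the token expires, this form is best kept for one-off runs; scheduled ingestion is better served by the username, password, and security token option.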

Integration Details

This plugin extracts Salesforce Standard and Custom Objects and their details (fields, record count, etc.) from a Salesforce instance. It uses the Python library simple-salesforce to authenticate and to call the Salesforce REST API to retrieve details from the Salesforce instance.

REST API Resources used in this integration

Concept Mapping

This ingestion source maps the following Source System Concepts to DataHub Concepts:

Source Concept | DataHub Concept | Notes
Salesforce | Data Platform |
Standard Object | Dataset | subtype "Standard Object"
Custom Object | Dataset | subtype "Custom Object"

Caveats

  • This connector has only been tested with Salesforce Developer Edition.
  • This connector currently supports only table-level profiling (row and column counts). Row counts are approximate, as returned by the Salesforce RecordCount REST API.
  • This integration does not support ingesting Salesforce External Objects.
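
To make the profiling caveat concrete, the fragment below sketches how table-level profiling could be switched on and limited to a subset of objects. It assumes only the profiling.enabled and profile_pattern fields described in the configuration table on this page; the credentials and object names are placeholders.

```yaml
source:
  type: "salesforce"
  config:
    instance_url: "https://mydomain.my.salesforce.com/"
    username: user@company
    password: password_for_user
    security_token: security_token_for_user
    # Ingest these three objects
    object_pattern:
      allow:
        - "Account$"
        - "Opportunity$"
        - "Lead$"
    # Turn on table-level profiling (row and column counts only)
    profiling:
      enabled: true
    # Profile only a subset of the objects allowed by object_pattern
    profile_pattern:
      allow:
        - "Account$"
```

With this fragment, Opportunity and Lead would still be ingested as datasets but would not be profiled.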

CLI based Ingestion

Install the Plugin

pip install 'acryl-datahub[salesforce]'

Starter Recipe

Check out the following recipe to get started with ingestion! See below for full configuration options.

For general pointers on writing and running a recipe, see our main recipe guide.

pipeline_name: my_salesforce_pipeline
source:
  type: "salesforce"
  config:
    instance_url: "https://mydomain.my.salesforce.com/"
    username: user@company
    password: password_for_user
    security_token: security_token_for_user
    platform_instance: mydomain-dev-ed
    domain:
      sales:
        allow:
          - "Opportunity$"
          - "Lead$"

    object_pattern:
      allow:
        - "Account$"
        - "Opportunity$"
        - "Lead$"

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"

Config Details

Note that a . is used to denote nested fields in the YAML recipe.

View All Configuration Options
Field | Required | Type | Description | Default
env | | string | The environment that all assets produced by this connector belong to | PROD
platform_instance | | string | The instance of the platform that all assets produced by this recipe belong to | None
auth | | enum(SalesforceAuthType) | Allowed symbols are USERNAME_PASSWORD, DIRECT_ACCESS_TOKEN | USERNAME_PASSWORD
username | | string | Salesforce username | None
password | | string | Password for Salesforce user | None
security_token | | string | Security token for Salesforce username | None
instance_url | | string | Salesforce instance url. e.g. https://MyDomainName.my.salesforce.com | None
is_sandbox | | boolean | Connect to Sandbox instance of your Salesforce | False
access_token | | string | Access token for instance url | None
ingest_tags | | boolean | Ingest Tags from source. This will override Tags entered from UI | False
platform | | string | | salesforce
object_pattern | | AllowDenyPattern (see below for fields) | Regex patterns for Salesforce objects to filter in ingestion. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
object_pattern.allow | | Array of string | List of regex patterns to include in ingestion | ['.*']
object_pattern.deny | | Array of string | List of regex patterns to exclude from ingestion. | []
object_pattern.ignoreCase | | boolean | Whether to ignore case sensitivity during pattern matching. | True
domain | | Dict[str, AllowDenyPattern] | Regex patterns for tables/schemas to describe domain_key domain key (domain_key can be any string like "sales".) There can be multiple domain keys specified. | {}
domain.key.allow | | Array of string | List of regex patterns to include in ingestion | ['.*']
domain.key.deny | | Array of string | List of regex patterns to exclude from ingestion. | []
domain.key.ignoreCase | | boolean | Whether to ignore case sensitivity during pattern matching. | True
profiling | | SalesforceProfilingConfig (see below for fields) | | {'enabled': False}
profiling.enabled | | boolean | Whether profiling should be done. Supports only table-level profiling at this stage | False
profile_pattern | | AllowDenyPattern (see below for fields) | Regex patterns for profiles to filter in ingestion, allowed by the object_pattern. | {'allow': ['.*'], 'deny': [], 'ignoreCase': True}
profile_pattern.allow | | Array of string | List of regex patterns to include in ingestion | ['.*']
profile_pattern.deny | | Array of string | List of regex patterns to exclude from ingestion. | []
profile_pattern.ignoreCase | | boolean | Whether to ignore case sensitivity during pattern matching. | True
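
As an illustration of how the dotted field names in the table translate into nested YAML, the sketch below expands object_pattern.allow, object_pattern.deny, object_pattern.ignoreCase, and two domain keys. The deny pattern and the "support" domain key are hypothetical examples, not values taken from the connector, and the credentials are placeholders.

```yaml
source:
  type: "salesforce"
  config:
    instance_url: "https://mydomain.my.salesforce.com/"
    username: user@company
    password: password_for_user
    security_token: security_token_for_user
    object_pattern:
      # object_pattern.allow
      allow:
        - ".*"
      # object_pattern.deny (hypothetical pattern for illustration)
      deny:
        - ".*History$"
      # object_pattern.ignoreCase
      ignoreCase: true
    domain:
      # domain.key.allow with key = "sales"
      sales:
        allow:
          - "Opportunity$"
          - "Lead$"
      # A second, hypothetical domain key
      support:
        allow:
          - "Case$"
```

Each segment of a dotted name becomes one level of indentation, and for the domain field the middle segment ("key" in the table) is replaced by whatever domain key you choose.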

Code Coordinates

  • Class Name: datahub.ingestion.source.salesforce.SalesforceSource
  • Browse on GitHub

Questions

If you've got any questions on configuring ingestion for Salesforce, feel free to ping us on our Slack.