Exporting data in Attribution

How to work with raw data exports from Attribution

πŸ“˜

This doc is for data analysts and developers

  • The data that can be exported and its format
  • How Attribution delivers data to you
  • Custom reports

Attribution allows you to export various raw data points, along with data generated by Attribution itself, for use by data analysts at your company. You can use this data to build custom reports and gain deeper insights into your raw data. Please be aware that raw data exports are unattributed, with no model applied; if you'd like to apply a model to this data, you'd have to build it yourself.

The data

Data consists of a number of linked tables that are exported as Apache Parquet files, which are best suited to cloud data warehouses (CSV export is also possible; for other supported options see the Amazon Redshift UNLOAD command).

Data is exported in two ways:

  • Incrementally – every day, only changes to the table are dumped. This applies to raw data sent to our tracking endpoint by your website or app, and to integration spend amounts.
  • Full resyncs – table data is exported in full on every run. These tables are dynamically generated by Attribution and include:
    • Filters, which represent the filter tree structure as shown in the dashboard
    • Visits (sessions), with bindings to filters.
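For incrementally exported tables, each daily dump has to be merged into your previously loaded copy. A minimal upsert sketch in pandas might look like the following; the key and timestamp column names (`id`, `updated_at`) are assumptions, so check the Data schema article for each table's real keys.

```python
# Sketch: merging a daily incremental dump into a previously loaded
# table. Column names ("id", "updated_at") are illustrative only.
import pandas as pd


def apply_increment(current: pd.DataFrame, increment: pd.DataFrame) -> pd.DataFrame:
    """Upsert rows from the incremental dump, keeping the newest version of each row."""
    combined = pd.concat([current, increment], ignore_index=True)
    combined = combined.sort_values("updated_at")
    # Later rows win: a changed row in the increment replaces the old one.
    return combined.drop_duplicates(subset="id", keep="last").reset_index(drop=True)
```

Fully resynced tables (Filters, Visits) don't need this step: you simply replace the whole table on every run.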

Full list of the exported tables:

  • Events
  • Properties
  • Parameters
  • Users
  • Visitors
  • Companies
  • Browsers
  • Spend
  • Filters
  • Visits
  • Impressions

More details about each table's structure are documented in the Data schema article.

Delivery

Attribution provides daily dumps of your data into the cloud storage provider of your choice. You are responsible for providing access to the storage where the data will be stored. Attribution requires full read/write access to your bucket; please make sure that no other data is stored in that bucket, since it could be deleted by the sync process. We support delivering data to the following cloud storages:

Once data is delivered to your bucket, you're fully responsible for what you do with it next. The next logical step is usually to load it into your database for further analysis.

Properly performing the ETL (extract, transform, load) procedure and configuring a data pipeline can be challenging (see Manually loading data export), so we offer an ETL service that can do it automatically for popular analytical databases:
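If you load the exports yourself into Amazon Redshift, Parquet files in S3 can be ingested with a `COPY ... FORMAT AS PARQUET` statement. The helper below just builds that SQL string; the table name, S3 prefix, and IAM role are placeholders you would replace with your own.

```python
# Sketch: generating a Redshift COPY statement for a Parquet dump in S3.
# All identifiers below (table, prefix, role ARN) are placeholders.
def copy_statement(table: str, s3_prefix: str, iam_role: str) -> str:
    """Build a Redshift COPY statement that loads Parquet files from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS PARQUET;"
    )
```

You would then run the resulting statement against your cluster with your usual database client, once per table and per daily dump prefix.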

Visitor level tracking - Enriching your data

One primary difference between Attribution and other tools is the fact that Attribution tracks on a visit level whereas other tools will typically track on a user level.

In other words, Attribution tracks to answer the question: you spent X and got Y conversions / revenue. It can do this at the visitor level because Attribution filters by URL. Check out the article here to learn more about how Attribution filters visits.

Alternatively, many tools already track at the user level, which allows you to say, 'user X completed event XYZ before reaching Y.'

This lets you answer behavioral questions like 'Which visitors spent the most over time?' or 'How many of the visitors who subscribed to the blue plan later completed event X?'

With Attribution you can enrich this data by combining your user-level data with Attribution's campaign-level data (spend, visits, etc.). This combination allows data scientists to answer behavioral and financial questions like 'Which visitors spent the most over time, and how much did it cost us in Google Ads to get there?' or 'Which type of our users generates the most MRR after accounting for advertising?'
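The combination described above is essentially a join between your user-level conversion data and Attribution's campaign-level spend. A minimal pandas sketch, with entirely illustrative frames and column names:

```python
# Sketch: enriching user-level conversions with campaign-level spend.
# Both DataFrames and all column names are illustrative, not the real
# export schema; see the Data schema article for actual columns.
import pandas as pd


def cost_per_conversion(conversions: pd.DataFrame, spend: pd.DataFrame) -> pd.DataFrame:
    """Join per-user conversions to per-campaign spend and compute cost per conversion."""
    per_campaign = conversions.groupby("campaign", as_index=False).agg(
        conversions=("user_id", "nunique"),
        revenue=("revenue", "sum"),
    )
    enriched = per_campaign.merge(spend, on="campaign", how="left")
    enriched["cost_per_conversion"] = enriched["spend"] / enriched["conversions"]
    return enriched
```

From there, comparing `revenue` against `spend` per campaign gives an advertising-adjusted view of which user segments generate the most value.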

If you have any questions please feel free to reach out to [email protected]