Connections for BI Users
(GDS, PowerBI, Tableau, others)
Pooja Kothari avatar
Written by Pooja Kothari
Updated over a week ago

Can I scale beyond Google Sheets to my favorite BI tool?

Connections also allow you to connect a dedicated DataHawk database to your BI tools and other destinations, allowing you access to our rich, raw data. These tools include Power BI, Looker, Tableau, and more. Click here for a complete list of compatible tools.


What kind of data will you be able to access?

  • All of the data you track on your DataHawk account through the app is also available with Connections, including:

    • Tracked products and keywords (listings, sales ranks, organic ranks, sponsored ranks on a daily basis)

    • Seller Central account data (orders, profit & loss, financial events)

    • Advertising data (aggregated per account, campaign or product)

  • And of course, the data will come with custom-made KPIs (i.e., sales estimates, keyword search volume)

Set-Up & Quickstart

Setting up is easy and involves only a few steps. Follow the step-by-step guide:


How to navigate the database?

Once connected to your BI tool, you'll discover six different schemas/folders filled with specific tables.

Overview of the accessible schemas

  1. Referential - everything related to your DataHawk account and more!

  2. Product - monitor and benchmark changes in products details and performance

  3. SEO - track, benchmark, and optimize products' organic and sponsored search performance

  4. Finance [requires a Seller Central source] - run easier 3P sales reporting and correlate sales data with other metrics

  5. Advertising [requires an Advertising source] - monitor and analyze your ads performance

  6. Market - [requires a Category source] - monitor daily best seller rank and estimated sales for the top 100 products of a chosen category with a two-year history

  7. Datafeeds - specific data feeds used for google sheets automatized dashboards

Important Data Specifications

General Information

Available data in the database depends on what is currently active on your DataHawk workspace: tracked products, tracked keywords & connected accounts.

For ease of understanding and documentation purposes, the datasets here are referred to as "tables" instead of views.

Collection Rate: We collect tracked items daily from Amazon or Walmart.

Refresh Rate: We update your database on an hourly basis.

Often Used Columns: Here is a list of frequently used columns within the tables and their definition/purpose:

  • Workspace:

    workspace_id is your DataHawk workspace identifier.

  • Marketplace:

    marketplace_key is the marketplace identifier (Amazon-US, Amazon-FR, Walmart-US, etc.).

  • Products:

    channel_product_id is the product identifier of a platform (ASIN for Amazon, walmart_product_id for Walmart).

    product_key is the unique identifier of a channel_product_id in a specific marketplace. (channel_product_id + marketplace_key β‡’ product_key)

  • Keywords:

    keyword_key is the unique identifier of a keyword in a specific marketplace. A best practice is to use this table to join keyword-related tables. (keyword + marketplace_key β‡’ keyword_key)

  • Finance & Advertising:

    account_key is the Amazon Seller Central identifier of a connected account. The same identifier is also used on Amazon Advertising accounts.

  • Sales ranks & browse nodes:

    channel_browse_node_id is the browse node identifier of a platform (amazon_id for Amazon, categoryId for Walmart).

    browse_node_id is our internal identifier for a browse node on Amazon or Walmart in a specific marketplace. It is defined by a channel_browse_node_id and a marketplace. (channel_browse_node_id + marketplace_key β‡’ browse_node_id)

  • Dates:

    All the timestamps columns are in UTC, except when explicitly stated otherwise (i.e., purchase_local_time).

    observation_time is the exact point of time when we observed the item (product, keyword, or rank) on the targeted marketplace. Due to the timestamp details, avoid using this column to join different data sources, as you'll risk losing some information.

    observation_date is the date when we observed the item. It is a column useful for joins between data sources as it is on a date level rather than a timestamp level.

    date_day is the date column we use for aggregated tables. Some of our tables are not based on events or observed web pages. Instead, they are based on an aggregated reports fetched (i.e., Advertising data). We use date_day as a date column for all types of daily aggregations.

    date_month has the same purpose as date_day but on a monthly basis. It shows the date of the first day of the month, and we use it for monthly aggregated tables .

    purchase_time / purchase_date / purchase_local_time are the date columns used for Seller Central orders. For order data, we have detailed information that goes as deep as the actual timestamp at which the order went through.

    event_date is the date column used for detailed financial events.

    sync_time is a technical column. It displays the timestamp when we ingested the data in our data warehouse. Sometimes data gets reprocessed for quality reasons, due to which this column gets updated. In addition to it being made available for data engineering reasons, it also ensures transparency about the freshness of the data, along with the time it took for it to go from collection to availability.

Referential - Everything related to your DataHawk workspace & more!

This schema contains:

  1. Information relative to your DataHawk workspace: tracked items, tags, Seller Central & Advertising accounts connected, projects.

  2. Repositories about marketplaces, browse node tree, and currency rate. These are useful to add extra information to other data sources, such as browse node details for sales rank data or currency rates to sales data.

REFERENTIAL_ADVERTISING_ACCOUNT

This table lists all the Advertising accounts you connected to DataHawk and general information about these accounts. Connecting these accounts will populate everything under Advertising.

REFERENTIAL_BROWSE_NODE_TRACKED

List of browse nodes you are currently tracking on the DataHawk's Connections.

A browse node is defined by a name (browse_node_name) and an id (browse_node_id). This will be enabled soon for this version of Connections.

REFERENTIAL_BROWSE_NODE_TREE

This table is our referential of all browse nodes on Amazon and Walmart and their hierarchies/paths. You can find information such as the path, the node level, the name or the last observation date.

A browse node is defined by its browse_node_id and can have several paths to get there.

REFERENTIAL_CURRENCY_RATE

This table is a matrix of dates and currency rates.

Since the first of January 2016, you have a currency, a base, and a rate for every date; the base is always USD.

All the data in the Finance and Advertising tables are in the marketplace's currency. If you want it in dollars, you can divide the sales column by the rate.

Here's a sample query that gets the sales in dollars for one of the Finance tables.

select f.*, f.sales/c.rate as sales_in_dollars from "DATAHAWK_SHARE_12345"."REFERENTIAL"."REFERENTIAL_CURRENCY_RATE" as C join "DATAHAWK_SHARE_12345"."FINANCE"."FINANCE_ACCOUNT_METRICS" as F on f.date_day=c.date_day and f.currency=c.currency

On the 27th of February 2022, 1 USD was worth 91.77 AFN.

REFERENTIAL_FINANCE_ACCOUNT

This table lists all the Seller Central (Finance) accounts you connected to DataHawk. Connecting these accounts will populate everything under FINANCE.

REFERENTIAL_KEYWORD_TAG

This table lists all the keywords tracked and the corresponding tags you've put on DataHawk.

REFERENTIAL_KEYWORD_TRACKED

This table lists all the keywords you tracked on DataHawk and information about when you started tracking them.

REFERENTIAL_MARKETPLACE

This is the list of all covered marketplaces and relative information about them.

REFERENTIAL_PRODUCT_TAG

This table lists all the products tracked and the corresponding tags you have assigned on DataHawk.

REFERENTIAL_PRODUCT_TRACKED

This table lists all the products you are currently tracking on DataHawk. If you want to enable the PRODUCT tables, you need to have tracked products on the platform beforehand.

REFERENTIAL_PROJECT

All the projects you've created on DataHawk and the keywords and products within each of them.

Product - Monitor and benchmark changes in products details and performance

PRODUCT_BADGES

This table contains all the different badges' name and type that we observed on your products on any given day since they were tracked.

PRODUCT_DETAILED

This table contains historical data of the daily product listings for all the products you track on our platform.

Every day we fetch the following information:

Marketplace, channel product id, group channel product id, date, time, name, price, brand, rating, rating count, buy box quantity, sales estimates, variations, image count, description, bullet points, rating percentages, A+ detection, dimensions, weight.

The following screenshot only shows a small part of it.

PRODUCT_KEYWORD_RESEARCH

This table looks at all of DataHawk's data over the last 7 days and counts every time we saw your products appear in all of DataHawk's keywords.

This allows you to quickly know which keywords are of interest and that you might want to track.

PRODUCT_LISTING_QUALITY_SCORE

This table shows a daily Listing Quality Score of your tracked products. This homemade KPI scans your product listings to give you an elaborate score.

PRODUCT_LISTING_QUALITY_SCORE_BREAKDOWN

It unveils a breakdown of the listing quality score for you to understand which part of your listing could be improved and how the score is being computed. The following screenshot shows a small part of the details available in this table.

PRODUCT_SALES_RANK

This table contains all the different browse nodes and sales ranks that we observed on your products on any given day since they were tracked.

NB: We are sourcing this from 2 different parts of Amazon: the product listings and the top 100 Best Seller Rank. As for Walmart, this is only sourced from the top 100 BSR.

PRODUCT_VARIATION_LATEST

This table shows the latest version of the variations of your tracked products. For every product tracked (channel_product_id), you have its latest parent (group_channel_product_id). This is, again, scoped to a marketplace (marketplace_key).

SEO - Track, benchmark, and optimize products' organic search performance

SEO_KEYWORD_VOLUME

A homemade KPI that estimates the search volume of the keywords you are tracking on a monthly basis.

SEO_PRODUCT_RANK_ORGANIC

Daily organic ranking (pages 1 to 5) of your tracked keywords and all visible data on the keyword page (channel product id, product name, number of ratings, rating, prime badge).

SEO_PRODUCT_RANK_SPONSORED

Daily sponsored ranking page of your tracked keywords along with all visible data on the keyword page (ASIN, product name, number of ratings, rating, prime badge).

Finance [requires a Seller Central source] - Run easier 3P sales reporting and correlate sales data with other metrics

FINANCE_ACCOUNT_METRICS

This table shows all the aggregate of successful orders that occurred for a certain account per day. It's a high-level aggregated dashboard to follow what's happening on an account level.

FINANCE_EVENTS

This table lists all financial events by day, account key, and marketplace.

Some financial events can be attached to an order id and an ASIN. The column metric maps an Amazon Financial Event with its amount.

This is a detailed table to help you go as deep as a reimbursed gift wrap event.

FINANCE_ORDERS

This table shows all the orders (successful & canceled) that occurred for a specific account per day. The main difference with the FINANCE_ACCOUNT_METRICS table is that this isn't aggregated data. It shows the information on an order level with columns like PURCHASE_TIME, showing the actual time that the order went through.

FINANCE_ORDER_SUCCESS

This table shows all the successful orders that occurred for a specific account per day. It is the same table as FINANCE_ORDERS, except that it's filtered to show only the successful (status = "Shipped") orders.

FINANCE_PRODUCT_METRICS_DAILY

This table aggregates orders on a product level daily. It's built to follow the performance of a product every day: number of orders, number of units sold, sales.

FINANCE_PRODUCT_METRICS_MONTHLY

This table aggregates orders on a product monthly level. It's built to follow the performance of a product every month: number of orders, number of units sold, sales.

FINANCE_PROFIT_LOSS

This table aggregates all profit & loss metrics and their amount per account key and day.

Advertising [requires an Advertising source] - Monitor and analyze your Ads performance

ADVERTISING_ACCOUNT_METRICS

Advertising data aggregated by account and date, and sponsoring type.

ADVERTISING_CAMPAIGN_METRICS

Advertising data aggregated by account, campaign, date, and sponsoring type.

ADVERTISING_PRODUCT_METRICS

Advertising data aggregated by account, date, product, and sponsoring type.

Marketing [requires a Category source] - Monitor best seller rank and estimated sales changes of a tracked category

MARKET_PRODUCT_SALES_ESTIMATES

Daily estimated sales of products per tracked category.

MARKET_BROWSE_NODE_SALES_ESTIMATES

Monthly total estimated sales per tracked category.

MARKET_BEST_SELLER_RANK

Daily sales ranks of products per tracked category.

Datafeeds - Access advanced views built for automated reports on Google Sheets

This schema carries the datasets we use to fill our automated reports on Google Sheets. They're structured in a compatible manner and, for clients that have a vast amount of Sources, may have truncated data. So be sure to use the other schemas if you've scaled past this point. To check, the first value of the first cell of any dataset in this schema would read "Truncated".

You can see which Google Sheets dashboard it is referring to by looking at the name of the table.

Naming convention: feed_targetdashboardname_tablename

Example: feed_profit_loss_product_detailed β‡’ profit & loss dashboard, product detailed table.

Did this answer your question?